
[APP-7635]: Define type for managing bluetooth WiFi provisioning. #62

Open
wants to merge 23 commits into base: main

Conversation

Member

@maxhorowitz maxhorowitz commented Feb 14, 2025

Summary

Add type (without implementing a low-level Linux solution) that can be "plugged-in" to the Agent provisioning flow for spinning up BT, waiting for credentials, and cleanly shutting itself down.

Included

  1. APP-7635: Define type for bluetooth provisioning, which does not include any low-level bluetooth-stack specific assumptions about how we will read/write values to bluetooth characteristics on the peripheral.
  2. I have NOT added unit tests. I will do so once we can agree upon the public implementation added in this PR.

The following functionality is added in this PR:

type bluetoothWiFiProvisioner struct {...}
func NewBluetoothWiFiProvisioner(...) (*bluetoothWiFiProvisioner, error)
func (bwp *bluetoothWiFiProvisioner) Start(ctx context.Context) error {...}
func (bwp *bluetoothWiFiProvisioner) Stop() error {...}
func (bwp *bluetoothWiFiProvisioner) RefreshAvailableNetworks(ctx context.Context, awns []*NetworkInfo) error {...}
func (bwp *bluetoothWiFiProvisioner) WaitForCredentials(ctx context.Context, requiresWiFiCredentials, requiresCloudCredentials bool) (*userInput, error) {...}

Example usage

The following code snippet describes how I plan to use the functionality added in this PR.

func bluetoothWiFiProvisioningExample(ctx context.Context, logger logging.Logger, availableWiFiNetworksChan chan []*NetworkInfo) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	bwp, err := ble.NewBluetoothWiFiProvisioner(ctx, logger, "Viam Agent") // STEP 1: Create instance that implements interface.
	if err != nil {
		logger.Fatal(err)
	}

	go func() { // STEP 2: Run background job to update available WiFi networks.
		for {
			// Block until the context is cancelled or a fresh scan result arrives.
			select {
			case <-ctx.Done():
				return
			case awn := <-availableWiFiNetworksChan:
				//
				// Example []*NetworkInfo value (passed in via channel):
				//
				// awn := []*NetworkInfo{
				// 	{
				// 		SSID:     "HomeWiFi",
				// 		Signal:   85,
				// 		Security: "WPA",
				// 	},
				// 	{
				// 		SSID:     "GuestWiFi",
				// 		Signal:   50,
				// 		Security: "None",
				// 	},
				// }
				//
				if err := bwp.RefreshAvailableNetworks(ctx, awn); err != nil {
					logger.Error(err)
					return
				}
			}
		}
	}()

	if err := bwp.Start(ctx); err != nil { // STEP 3: Start advertising BLE to a client.
		logger.Fatal(err)
	}

	// Interface method naming implies this should be a blocking call.
	userInput, err := bwp.WaitForCredentials(ctx, true, true) // STEP 4: Wait for the client to send credentials over BT.
	if err != nil {
		logger.Fatal(err)
	}
	logger.Infof("User provided SSID: %s and PSK: %s, will attempt to connect with those WiFi credentials. "+
		"Once connected, will provision this device with the following cloud config: Robot Part Key ID: %s, "+
		"Robot Part Key: %s", userInput.SSID, userInput.PSK, userInput.PartID, userInput.Secret)

	if err := bwp.Stop(); err != nil { // STEP 5: Close BLE advertisement (tears down client connection if exists).
		logger.Fatal(err)
	}
}

@maxhorowitz maxhorowitz changed the title [APP-7635][APP-7637]: Define interfaces for bluetooth provisioning and for bluetooth service, which, together, comprise an OS-agnostic solution for bluetooth WiFi/robot provisioning on the Agent. [APP-7635][APP-7637]: Define interfaces for bluetooth provisioning and for bluetooth service. Feb 14, 2025
@maxhorowitz
Member Author

Note to reviewers: once we can agree upon the interface(s) added here, I will add unit testing to make it airtight.

@maxhorowitz maxhorowitz force-pushed the APP-7635-APP-7637 branch 3 times, most recently from f5f1be7 to 5d57942 Compare February 14, 2025 19:11
"fmt"
)

type bluetoothService interface {
Member Author

I know it is against Golang best practice to have unexported interfaces, and I'm far from an expert at using generic types, so let me know if there is a better way to design this feature.

The purpose of this layer of abstraction was to keep the Linux-specific BT code out of the core implementation of our BluetoothWiFiProvisioner interface. It isn't exported because I don't expect anyone to need it. If you have more specific questions, let me know, and I'll dive in.

Member

It's tough for me to tell what's going on here without the implementer of this bluetoothService interface, but I'm definitely suspicious of the need for any interface at all.

Will you have multiple implementers of bluetoothService? Or multiple of BluetoothWiFiProvisioner?

Using interfaces to have a "pimpl" (private implementation + public methods) design is definitely against Go best practices, but maybe I'm misunderstanding what you're doing here. Interfaces are generally helpful for accepting multiple types of things that implement a common set of methods, not for hiding implementations. Apologies if you know these things already, but I figured I'd come out strong and say I'm suspicious of the patterns you're using here wrt the two interfaces (one of which is unexported).

I think my concern extends to your use of generics, as they seem to be how you're getting around your usage of interfaces.
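To make the point about interfaces concrete, here is a minimal sketch of the usual Go arrangement: the constructor returns a concrete type, and the code that consumes it declares a small interface listing only what it needs, which also lets a test double stand in. All names here (`Provisioner`, `ssidReader`, `fakeReader`, `describe`) are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// Provisioner is a hypothetical concrete type. Its constructor returns it
// directly instead of hiding it behind an interface.
type Provisioner struct{ name string }

// NewProvisioner returns the concrete type.
func NewProvisioner(name string) *Provisioner { return &Provisioner{name: name} }

// SSID is a placeholder for a real read from the Bluetooth layer.
func (p *Provisioner) SSID() (string, error) { return "HomeWiFi", nil }

// ssidReader is declared by the consumer and lists only the method it uses.
type ssidReader interface {
	SSID() (string, error)
}

// fakeReader is a test double that also satisfies ssidReader.
type fakeReader struct{ ssid string }

func (f fakeReader) SSID() (string, error) { return f.ssid, nil }

// describe accepts anything that can report an SSID.
func describe(r ssidReader) string {
	ssid, err := r.SSID()
	if err != nil {
		return "error: " + err.Error()
	}
	return "ssid=" + ssid
}

func main() {
	fmt.Println(describe(NewProvisioner("Viam Agent")))
	fmt.Println(describe(fakeReader{ssid: "TestNet"}))
}
```

The interface exists because `describe` accepts more than one implementation, not to hide the struct.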

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

Thanks for diving in --- and definitely appreciate explanations of pimpl and best practices because I'm always looking to level up in Golang. I've used pimpl interfaces in the past because I liked how it made testing modular, so that's what I was going for --- and happy to learn that's not how it should be done.

I added the bluetoothService interface with the idea that some day we could support more than just Linux. The implementation I had in mind was linuxBluetoothService, windowsBluetoothService, etc., so that we could abstract away OS-specific bluetooth differences from the business logic inside of BluetoothWiFiProvisioner. That being said, it is super unlikely we get to that point soon. So the only real reason I could have for adding it is to make unit tests not reliant on the OS, but even that seems like overkill.

Changes I will make:

  1. Remove the BluetoothWiFiProvisioner interface and make the corresponding private impl public
  2. Remove the bluetoothService interface and instead add the methods I had in mind as private methods on the BluetoothWiFiProvisioner type so they're only accessible here

}

// BluetoothWiFiProvisioner provides an interface for managing the bluetooth (bluetooth-low-energy) service as it pertains to WiFi setup.
type BluetoothWiFiProvisioner interface {
Member Author

This is the exported interface which is to be consumed in the provisioning loop. In the interest of not "muddying" our provisioning loop flow, I keep this interface very simple. It only has things that you would need from the perspective of the provisioning loop, and it doesn't include anything about BT peripheral advertisement, characteristics, etc.

I get into the more "under-the-hood" functionality in the bluetooth_service.go file. There, I also have an interface so I can abstract away all Linux OS stuff.

Member

@benjirewis benjirewis left a comment

Missing some of the picture here, but I thought I'd leave a round of comments before I'm OOO for most of next week. Feel free to ignore anything, too, more @Otterverse's + your code than mine.

}

// RefreshAvailableNetworks updates the list of networks that are advertised via bluetooth as available.
func (bwp *bluetoothWiFiProvisioner[T]) RefreshAvailableNetworks(ctx context.Context, awns *AvailableWiFiNetworks) error {
Member

This method probably doesn't need a context if it's not passing it anywhere and is just immediately checking if it has errored.

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

Yes, removed from there and passing it a layer below.

wg.Add(2)
utils.ManagedGo(
func() {
ssid, ssidErr = waitForBLEValue(ctx, bwp.svc.readSsid, "ssid")
Member

If one of these four goroutines errors, do you want the other goroutines to stop?

Member Author

Hmm - yes. That's right because if it enters that line it means we're requiring that value to be read properly.

}

// waitForBLEValue is used to check for the existence of a new value in a BLE characteristic.
func waitForBLEValue(
Member

[nit] This function isn't really waiting for any BLE value, it's just calling a callback in a loop and continuing if there's a certain type of error. I wonder if the name should reflect that a bit better.

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

Good point - will make it more accurate and less biased toward how it's used.

"sync"
"time"

"github.com/pkg/errors"
Member

[nit] Not sure what agent's pattern is, but generally the standard library errors is a little more idiomatic.

switch os := runtime.GOOS; os {
case "linux":
fallthrough
case "windows":
Member

Pls no.

Member Author

😅

@maxhorowitz maxhorowitz changed the title [APP-7635][APP-7637]: Define interfaces for bluetooth provisioning and for bluetooth service. [APP-7635][APP-7637]: Implement type for managing bluetooth WiFi provisioning. Feb 18, 2025
Comment on lines 10 to 12
"errors"

errw "github.com/pkg/errors"
Member Author

Copied this (including the errw prefix) from elsewhere in the Agent. The "github.com/pkg/errors" package has errors.WithMessage(err, "some wrapping message") and errors.Errorf("message with arg %s", arg) (which do not exist in the standard "errors" package).

Member

I assume that using errw is just so we can migrate old code without too much hassle, but for new code you can use fmt.Errorf:

// An easy way to create wrapped errors is to call [fmt.Errorf] and apply
// the %w verb to the error argument:
//
//	wrapsErr := fmt.Errorf("... %w ...", ..., err, ...)

Member Author

Awesome. Didn't know it was doable that way. Will change to this, thanks.

@maxhorowitz maxhorowitz changed the title [APP-7635][APP-7637]: Implement type for managing bluetooth WiFi provisioning. [APP-7635]: Define type for managing bluetooth WiFi provisioning. Feb 18, 2025
…d and rename waitForBLEValue to retryCallbackOnEmptyCharacteristicError.
@maxhorowitz maxhorowitz requested a review from ale7714 February 18, 2025 20:21

// AvailableWiFiNetworks represents the networks that the device has detected (and which may be available for connection).
type AvailableWiFiNetworks struct {
Networks []*struct {
Member

you can re-use this type

type NetworkInfo struct {

Member Author

Same as the above with regards to import cycles. Can use, but would need to pull that out. Otherwise, maybe we should just be moving this code into the existing provisioning package as its own file where it can also access those values?

Member Author

Moved into provisioning and reused this type.

)

// Credentials represents the minimum required information needed to provision a Viam Agent.
type Credentials struct {
Member

Let's reuse

type userInput struct {

These types will be the same in the BLE and the Hotspot flow

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

I can use that type but will have to move it out of the provisioning package and into an isolated package where both can import it (to get around an import cycle). Does that sound like something you would want here?

Member Author

Moved into provisioning and reused this type.

if err := bwp.startAdvertisingBLE(ctx); err != nil {
return err
}
bwp.enableAutoAcceptPairRequest() // Async goroutine (hence no error check) which auto-accepts pair requests on this device.
Member

does this mean we won't provide helpful logging to the users? or that part will be implemented in enableAutoAcceptPairRequest?

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

Logging is included in the implementation of enableAutoAcceptPairRequest. The stacked PR containing its impl is here: #69.

@maxhorowitz maxhorowitz requested a review from cheukt February 18, 2025 21:53
wg.Wait()
return &Credentials{
Ssid: ssid, Psk: psk, RobotPartKeyID: robotPartKeyID, RobotPartKey: robotPartKey,
}, multierr.Combine(ssidErr, pskErr, robotPartKeyIDErr, robotPartKeyErr)
Member

can use errors.Join

Comment on lines 74 to 120
if requiresWiFiCredentials {
wg.Add(2)
utils.ManagedGo(
func() {
if ssid, ssidErr = retryCallbackOnExpectedError(
ctx, bwp.readSsid, &emptyBluetoothCharacteristicError{}, "failed to read ssid",
); ssidErr != nil {
cancel()
}
},
wg.Done,
)
utils.ManagedGo(
func() {
if psk, pskErr = retryCallbackOnExpectedError(
ctx, bwp.readPsk, &emptyBluetoothCharacteristicError{}, "failed to read psk",
); pskErr != nil {
cancel()
}

},
wg.Done,
)
}
if requiresCloudCredentials {
wg.Add(2)
utils.ManagedGo(
func() {
if robotPartKeyID, robotPartKeyIDErr = retryCallbackOnExpectedError(
ctx, bwp.readRobotPartKeyID, &emptyBluetoothCharacteristicError{}, "failed to read robot part key ID",
); robotPartKeyIDErr != nil {
cancel()
}
},
wg.Done,
)
utils.ManagedGo(
func() {
if robotPartKey, robotPartKeyErr = retryCallbackOnExpectedError(
ctx, bwp.readRobotPartKey, &emptyBluetoothCharacteristicError{}, "failed to read robot part key",
); robotPartKeyErr != nil {
cancel()
}
},
wg.Done,
)
}
Member

This pattern feels odd to me: why are we starting 4 goroutines that each call 1 function in a loop? It would help to know, at a high level, how each of the functions will work.

Member Author

@maxhorowitz maxhorowitz Feb 18, 2025

To simplify handling asynchronous Bluetooth reads, I created a function that calls readSsid, readPsk, readRobotPartKeyID, and readRobotPartKey in retrying loops (1x per second). Since these values can arrive at any time—or may never be set—we want to wait for each one in parallel.

Rather than leaving this logic to the caller, which could be error-prone, this function ensures all values are retrieved before returning. And in the error case, it cancels context for all. Each function checks an in-memory store for existing data.

The requiresWiFiCredentials and requiresCloudCredentials booleans add flexibility, allowing callers to request only WiFi credentials, cloud credentials, or both as needed.

Member

I'm looking at https://github.com/viamrobotics/agent/pull/68/files so let me know if I'm missing something.

It seems possible to have only 1 goroutine that calls readSsid, readPsk, readRobotPartKeyID, and readRobotPartKey sequentially and then retries every second. Is there a reason not to?

"go.viam.com/utils"
)

// bluetoothWiFiProvisioner provides an interface for managing BLE (bluetooth-low-energy) peripheral advertisement on Linux.
Member Author

Suggested change
- // bluetoothWiFiProvisioner provides an interface for managing BLE (bluetooth-low-energy) peripheral advertisement on Linux.
+ // bluetoothWiFiProvisioner provides methods for managing BLE (bluetooth-low-energy) peripheral advertisement on Linux.

}
v, err := fn()
if err != nil {
if errors.As(err, &expectedErr) {
Member Author

Note to self to fix this and manually test the error matching works.

Comment on lines 170 to 171
func retryCallbackOnExpectedError(
ctx context.Context, fn func() (string, error), expectedErr error, description string,
Member Author

Suggested change
- func retryCallbackOnExpectedError(
- ctx context.Context, fn func() (string, error), expectedErr error, description string,
+ func retryCallbackOnExpectedError[T any](
+ ctx context.Context, fn func() (T, error), expectedErr error, description string,

@maxhorowitz
Member Author

maxhorowitz commented Feb 19, 2025

@cheukt can't seem to get out of "review" mode on my PR which is preventing me from commenting on your comments.

> I'm looking at https://github.com/viamrobotics/agent/pull/68/files so let me know if I'm missing something. It seems possible to only have 1 goroutine that calls readSsid, readPsk, readRobotPartKeyID, and readRobotPartKey sequentially and then retry every second, is there a reason why not?

No reason not to do that, other than we will have to add additional state in this function for "skipping" a read if we've already received a value for it.

Member Author

@maxhorowitz maxhorowitz left a comment

Comments for self.

@cheukt
Member

cheukt commented Feb 19, 2025

ok, then we should prefer fewer goroutines - we can probably not spin a new one off at all and just use a for loop in the Wait function. I would avoid goroutines unless necessary; they introduce complexity and add a ton of mental overhead for any future reader.

…unexported interface so that the BT functionality can be mocked from provisioning tests.
… methods that exist on its implementation from the calling code (in networkmanager.go).
Member

@Otterverse Otterverse left a comment

I know this is far from finished, but left some initial comments. (Apologies if you already know these things and just haven't gotten to them yet.)


// NewBluetoothWiFiProvisioningService returns a service which accepts credentials over bluetooth to provision a robot and its WiFi connection.
func NewBluetoothWiFiProvisioningService(ctx context.Context, logger logging.Logger, name string) (*bluetoothServiceLinux, error) {
switch os := runtime.GOOS; os {
Member

Generally we don't want to do per-OS logic at runtime. Go has a pretty clever mechanism (build constraints) for making different versions of a file (and therefore the functions it contains) build only under certain conditions. E.g. the "windows" build would include only the windows code, and the regular/linux file would include the linux versions. That keeps code cleaner and resulting binaries smaller as well.

See https://pkg.go.dev/cmd/go#hdr-Build_constraints for details, but feel free to ask me for more examples as we've used it quite a bit in RDK itself.
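As a rough illustration of build constraints, the file below carries a `//go:build` line so it is compiled on every target except Windows; a sibling file starting with `//go:build windows` would define the same function for Windows. The file names and `newBluetoothBackend` are hypothetical, and the real split in the agent may look different.

```go
//go:build !windows

// File: bluetooth_other.go (hypothetical name). A sibling file,
// bluetooth_windows.go, would start with "//go:build windows" and define
// newBluetoothBackend for Windows, so exactly one definition is compiled
// per target and no runtime.GOOS switch is needed.
package main

import "fmt"

// newBluetoothBackend is a hypothetical constructor. This variant is built
// on every target except Windows.
func newBluetoothBackend() string {
	return "default backend"
}

func main() {
	fmt.Println("bluetooth backend:", newBluetoothBackend())
}
```

A file name suffix such as `_linux.go` or `_windows.go` acts as an implicit constraint, so the explicit line is only needed for combinations like the one shown.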

}

// Not ready to return (do not have the minimum required set of credentials), so sleep and try again.
time.Sleep(time.Second)
Member

Don't use time.Sleep in agent/subsystems; it can cause health checks to fail. If you're inside a loop, use the mainLoopHealth/bgLoopHealth.Sleep(ctx) function instead.

Comment on lines +48 to +51
wg := sync.WaitGroup{}
wg.Add(1)

utils.ManagedGo(func() {
Member

I don't understand why you're backgrounding this, only to wg.Wait() right afterwards.

wg.Wait()

return &userInput{
SSID: ssid, PSK: psk, PartID: robotPartKeyID, Secret: robotPartKey,
Member

That's a lot of individual variables to be shared into the lambda. Why not just declare &userInput{} itself and set its contents inside the lambda?

5 participants