-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rsdk 9697 support multiple nvs #376
Rsdk 9697 support multiple nvs #376
Conversation
} | ||
|
||
#[derive(Clone)] | ||
pub struct NVSStorage { | ||
// esp-idf-svc partition driver ensures that only one handle of a type can be created | ||
// so inner mutability can be achieves safely with RefCell | ||
nvs: Rc<RefCell<EspCustomNvs>>, | ||
partition_name: CString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store a copy of the partition rather than the partition name to avoid cloning a CString as we pass the Storage through different par to of the code
fn reset_ota_metadata(&self) -> Result<(), Self::Error>; | ||
} | ||
|
||
pub trait StorageDiagnostic { | ||
fn log_space_diagnostic(&self); | ||
} | ||
|
||
#[derive(Error, Debug)] | ||
#[error("empty storage collection")] | ||
pub struct EmptyStorageCollectionError; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a special error to handle the case where a collection is empty
@@ -158,6 +163,22 @@ struct RAMCredentialStorageInner { | |||
#[derive(Default, Clone)] | |||
pub struct RAMStorage(Rc<Mutex<RAMCredentialStorageInner>>); | |||
|
|||
#[derive(Error, Debug)] | |||
pub enum RAMStorageError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add an error type for RAMStorage to make it easier to test
self.into_iter().any(OtaMetadataStorage::has_ota_metadata) | ||
} | ||
fn get_ota_metadata(&self) -> Result<OtaMetadata, Self::Error> { | ||
self.into_iter().fold( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea here is to use fold to call get_ota_metadata
until Ok is returned. If all storage in a collection return an error only the last error is reported.
We don't have an iterator that behave like try_fold
but actually short-circuit on the first Ok, so we will iterate through the whole collection even after a get_ota_metadata
has succeeded. The function get_ota_metadata
will however not be called anymore on storage
{ | ||
type Error = Storage::Error; | ||
fn has_ota_metadata(&self) -> bool { | ||
self.into_iter().any(OtaMetadataStorage::has_ota_metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
short-circuit on the first true
#[cfg(feature = "ota")] | ||
impl<Iterable, Storage: OtaMetadataStorage> OtaMetadataStorage for Iterable | ||
where | ||
for<'a> &'a Iterable: IntoIterator<Item = &'a Storage>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will only work on collections that implements impl<'a, T,> IntoIterator for &'a ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one small concern here. I don't have a fully worked counterexample, but I wonder a bit about using something like a HashSet as the type backing IntoIterator, because the iteration order isn't deterministic.
@@ -136,7 +137,7 @@ impl StorageDiagnostic for NVSStorage { | |||
fn log_space_diagnostic(&self) { | |||
let mut stats: nvs_stats_t = Default::default(); | |||
if let Err(err) = | |||
esp!(unsafe { nvs_get_stats(self.partition_name.as_ptr(), &mut stats as *mut _) }) | |||
esp!(unsafe { nvs_get_stats(self.part.handle() as _, &mut stats as *mut _) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handle is the partition name
fn reset_ota_metadata(&self) -> Result<(), Self::Error> { | ||
self.into_iter().fold( | ||
Err::<_, Self::Error>(EmptyStorageCollectionError.into()), | ||
|val, s| val.or(s.reset_ota_metadata()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taking advantage of the eagerness of or might not be the best approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate a bit? I think I know what you are saying, but I'd rather not guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As opposed to or_else
or
will always evaluate a function in argument if there is one (because it doesn't actually accept a closure) before considering replacing val
hence the eagerness. We use it our advantage to make sure all copies of an object are erased through all storage. In theory we shouldn't need this since we use or_else elsewhere so only on copy should be in the storage array at anytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neat, LGTM, a few small comments. I'd like to think about it a bit tomorrow before it merges so I can make sure I totally understand how the various folds are working and think about whether there are any tricky edge cases.
let mut inner_ref = self.0.lock().unwrap(); | ||
let _ = inner_ref.robot_creds.insert(creds); | ||
Ok(()) | ||
} | ||
fn get_robot_credentials(&self) -> Result<RobotCredentials, Self::Error> { | ||
let inner_ref = self.0.lock().unwrap(); | ||
let cfg = inner_ref.robot_creds.clone().unwrap_or_default().clone(); | ||
Ok(cfg) | ||
log::info!("get robot creds"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray log, or debug!
?
As an aside, I do miss some of the logging we used to have pre #316 that gave a little more context about progression through the startup sequence, whether we were using cached credentials or configs, etc. I'm usually for "silence is golden" but in this case that's very useful information to understand how things are coming up, especially in offline mode. Would you be OK with a ticket to re-introduce some of that logging? Or did you not propagate into the refactored world because you found it too chatty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stray log i will remove it.
Yes let's make a ticket for some logging housekeeping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, in https://viam.atlassian.net/browse/RSDK-9715. Feel free to add more things you'd like to see added to a logging pass.
} | ||
fn reset_robot_credentials(&self) -> Result<(), Self::Error> { | ||
let mut inner_ref = self.0.lock().unwrap(); | ||
let _ = inner_ref.robot_creds.take(); | ||
log::info!("reset robot creds"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above
assert!(!v.has_robot_credentials()); | ||
assert!(ram2 | ||
.store_robot_credentials(&CloudConfig { | ||
app_address: "http://downloadramstorage.org".to_owned(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe .example.org
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that was for the joke :P
fn reset_ota_metadata(&self) -> Result<(), Self::Error> { | ||
self.into_iter().fold( | ||
Err::<_, Self::Error>(EmptyStorageCollectionError.into()), | ||
|val, s| val.or(s.reset_ota_metadata()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate a bit? I think I know what you are saying, but I'd rather not guess.
How do you envision configuring the space of available NVS, for, say, callers of |
What do you mean? |
As discussed in-person, I was interested in how you saw the ability to configure the selection of available partitions expressed in the FFI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though I'm interested in your thoughts about my iteration order worry.
micro-rdk/src/esp32/nvs_storage.rs
Outdated
self.set_string(NVS_ROBOT_ID_KEY, &cfg.id)?; | ||
self.set_string(NVS_ROBOT_APP_ADDRESS, &cfg.app_address)?; | ||
self.set_string(NVS_ROBOT_ID_KEY, &cfg.id).or_else(|err| { | ||
let _ = self.reset_robot_credentials(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- By using
reset_robot_credentials
, I think there is now a requirement that it erase keys in the same order thatstore_robot_credentials
writes them, since a failederase_key
ends the method. Maybe it is better to explicitly just unset the keys as acted on so far in thestore
methods rather than relying on ordering in a different method? Accumulate some guards? - ⬇️ - Is it a bug that
reset_robot_credentials
doesn't clearNVS_ROBOT_APP_ADDRESS
?
micro-rdk-ffi/src/ffi/runtime.rs
Outdated
@@ -284,7 +315,22 @@ pub unsafe extern "C" fn viam_server_start(ctx: *mut viam_server_context) -> via | |||
#[cfg(not(target_os = "espidf"))] | |||
let storage = micro_rdk::common::credentials_storage::RAMStorage::default(); | |||
#[cfg(target_os = "espidf")] | |||
let storage = micro_rdk::esp32::nvs_storage::NVSStorage::new("nvs").unwrap(); | |||
//let storage = {micro_rdk::esp32::nvs_storage::NVSStorage::new("nvs").unwrap()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented line
micro-rdk-ffi/src/ffi/runtime.rs
Outdated
@@ -257,7 +288,7 @@ pub unsafe extern "C" fn viam_server_start(ctx: *mut viam_server_context) -> via | |||
use micro_rdk::common::credentials_storage::RAMStorage; | |||
use micro_rdk::common::credentials_storage::RobotConfigurationStorage; | |||
use micro_rdk::proto::provisioning::v1::CloudConfig; | |||
|
|||
log::info!("when the FFI was built a robot config was supplied, defaulting to this robot"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe take this out for now but add a TODO(RSDK-9715)
which is my init logging ticket so we can do this in coordination with other startup logging changes?
@@ -310,6 +310,18 @@ void app_main(void) | |||
return; | |||
} | |||
|
|||
ret = viam_server_add_nvs_storage(viam_ctx, "nvs"); | |||
if (ret != VIAM_OK) { | |||
ESP_LOGE(TAG,"couldn't set add nvs partition, error : %i", ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after ,
here and below please
I think the fix you've made here about ensuring that writes of multiple keys are all-or-nothing in a given partition is necessary. But I'm still a little concerned about iteration order determinism over the set of storages. Per my slack message:
|
I think this can be an issue when storing config or certificate only since we don't explicitly erase these objects when storing them during viam server initialization. We implicitly expect the storage driver to overwrite. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Implements the storage traits for any type we can iterate over, we can then easily pass an array of a vec as a storage to viam server.
With this something like
let storage = vec![ NVSStorage::new("nvs").unwrap(), NVSStorage::new("nvs2").unwrap(), ];
Will allow use to store the robot data over multiple NVS
I didn't want to add an extra type to represent a collection of storage and this way the logic is driven by the underlying storage implementation.