This is still a work in progress, but I think the approach is about right. Includes a rewrite of fuse/file to broadly mimic https://github.com/bazil/bolt-mount/blob/master/file.go, as suggested in bazil/fuse#225. Remaining work:
Force-pushed from ffc29d6 to 8458a1c.
Still reviewing, just left a few initial comments.
Force-pushed from 75c31ea to 03a397d.
I think this is ready for review again.
Some initial comments, still reviewing
This is a developer-centric option that has to be explicitly enabled with an environment variable. By default it mounts `/tmp`; set `local.basepath` to mount any other directory. Signed-off-by: Michael Smith <[email protected]>
Implement the new Writable interface (defined to mirror the Readable interface). Implement Write on a local file interface for testing. Also adds fuse/file tests. Supports puppetlabs-toy-chest#620. Signed-off-by: Michael Smith <[email protected]>
This makes editors that rely on size read the entire file. Saves size from the last open on the fs node so we can use it to serve attribute requests. Fixes puppetlabs-toy-chest#656. Signed-off-by: Michael Smith <[email protected]>
Also use a context for S3 get object requests. Signed-off-by: Michael Smith <[email protected]>
Signed-off-by: Michael Smith <[email protected]>
Signed-off-by: Michael Smith <[email protected]>
Add `sizeValid` to track whether we need to call `plugin.Size` before using `size`. This is only needed when opening a file WriteOnly to delay a Read - to ensure we have the entire file content before Write - so we only do it when necessary.
Handle `io.EOF` when loading the file in case a call to `Setattr` increased it beyond the original size but didn't write to fill that space, and pad with null characters when that happens.
Upgrade journal entries to warnings when they result in an error. FUSE doesn't relay the actual error messages.
Delay releasing data and clearing entry cache until we release all writers.
Tweak a few tests so we have a balance of ReadWrite tests that rely on attribute size or calling Read to determine size.
Update TestTruncateAndWrite to ensure we don't call Read when we've called Setattr to truncate the file first.
Add additional comments explaining how to use BlockReadable and Writable, and some code comments to clarify how Open works.
Signed-off-by: Michael Smith <[email protected]>
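The `io.EOF` handling described in the commit above can be pictured with a minimal sketch. This is not the code from the PR: `loadContent` and `loadString` are hypothetical names, and the sketch only assumes that the expected size may be larger than what the underlying reader can supply (for example after `Setattr` grew the file).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// loadContent reads size bytes from r starting at offset 0.
// If r ends early (io.EOF), the rest of the buffer is left as null bytes,
// mimicking a file that was grown without writing to fill the new space.
// Hypothetical helper for illustration; not Wash's actual implementation.
func loadContent(r io.ReaderAt, size int64) ([]byte, error) {
	buf := make([]byte, size)
	n, err := r.ReadAt(buf, 0)
	if err != nil && !errors.Is(err, io.EOF) {
		return nil, err
	}
	// ReadAt may use the whole buffer as scratch space, so zero the tail explicitly.
	for i := n; i < len(buf); i++ {
		buf[i] = 0
	}
	return buf, nil
}

// loadString is a convenience wrapper used only by this demo.
func loadString(s string, size int64) ([]byte, error) {
	return loadContent(strings.NewReader(s), size)
}

func main() {
	data, err := loadString("hello", 8)
	if err != nil {
		panic(err)
	}
	// The last three bytes are null padding.
	fmt.Printf("%q\n", data)
}
```

Treating `io.EOF` as a short read rather than a failure is what lets a later Write land anywhere inside the enlarged buffer.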
Per the conversation on Slack, we want a way to distinguish between `file-like` and `non-file-like` entries.
- `File-like` entries are entries with content, such that Read returns that content and Write updates it. `Non-file-like` entries are entries without content, so that "Read" data could represent one thing and the "Written" data could represent another.
- For plugin author docs: `file-like` entries are entries with a `Size` attribute (implicitly entry is also Readable). So an entry has content iff it has a `Size` attribute. Thus, people should not set size for `non-file-like` entries.
  - Reason: `Size` attribute means you can filter those things with `find`. When would filtering on `Size` make sense for Docker container logs, cloud function logs, etc?
    - We could always change these semantics later to a `is-file-like` key in the entry JSON on user demand
- Implementation:
  - `File-like` entries are `Readable`, `Writable`, and have a `Size` attribute.
    - Reason: Up to now, we tell people “set size if you know it” for Readable entries. We also do that for things like `metadata.json` entries, which are not `file-like`. These semantics ensure that those entries don't accidentally become `file-like`.
  - For `file-like` entries, `f.data` represents the content. For `non-file-like` entries, `f.data` represents what's about to be written.
  - For `file-like` entries, the first `Write` call loads the entire data (because we're going to overwrite it anyways on a `Flush`). `f.data` is then updated appropriately with the written block.
  - `Flush` should just write whatever `f.data` is.
  - For `Read`, `f.data` should only be returned for `file-like` entries (and if it exists). Otherwise, we should delegate to `plugin.ReadWithAnalytics`. That's because `f.data` represents the content for `file-like` entries while for other entries, it's separate from what could be read.
One example of a non-file-like entry that may have Read/Write could be a GCP log, where read could return its most recent 1000 entries while write could write a new entry.
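The buffer semantics proposed in this comment can be sketched as follows. This is a simplified model, not the PR's `fuse/file.go`: the `entry` and `file` types here are hypothetical stand-ins, reads and writes are modelled as plain closures, and the real code delegates reads to `plugin.ReadWithAnalytics` instead.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a plugin entry (not Wash's plugin.Entry).
type entry struct {
	fileLike bool           // true when the entry reports a Size attribute
	read     func() []byte  // what a Read of the entry returns
	write    func(b []byte) // sends b to the entry
}

// file models the semantics discussed above: data holds the content for
// file-like entries, and only the pending write for non-file-like ones.
type file struct {
	e    entry
	data []byte
}

// Write applies b at offset off. For a file-like entry the first Write loads
// the existing content, because Flush overwrites the whole entry.
func (f *file) Write(off int, b []byte) {
	if f.data == nil && f.e.fileLike {
		f.data = append([]byte{}, f.e.read()...)
	}
	if need := off + len(b); need > len(f.data) {
		f.data = append(f.data, make([]byte, need-len(f.data))...)
	}
	copy(f.data[off:], b)
}

// Read returns the buffer only for file-like entries that have one;
// otherwise it delegates to the entry.
func (f *file) Read() []byte {
	if f.e.fileLike && f.data != nil {
		return f.data
	}
	return f.e.read()
}

// Flush writes whatever is buffered.
func (f *file) Flush() {
	if f.data != nil {
		f.e.write(f.data)
	}
}

func main() {
	content := []byte("hello world")
	f := &file{e: entry{
		fileLike: true,
		read:     func() []byte { return content },
		write:    func(b []byte) { content = b },
	}}
	f.Write(0, []byte("HELLO"))
	f.Flush()
	fmt.Println(string(content))
}
```

For a non-file-like entry such as the log example, the same `Write` and `Flush` send only the new bytes, and `Read` keeps returning whatever the entry itself reports.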
Updated.
Clarifies difference in how files with and without Size attributes work by defining them as file-like and non-file-like and describing what that means. Primarily enables interaction with files where read and write are not symmetric. Signed-off-by: Michael Smith <[email protected]>
Force-pushed from dee1c8d to 8fc111e.
Reviewing the tests next
docs/docs/index.md (outdated)
    ### write
    The `write` action lets you write data to an entry. Thus, any command that writes a file also works with these entries.

    Note that Wash distinguishes between file-like and non-file-like entries. An entry is file-like if it's readable and writable and defines its size; you can edit it like a file. If it doesn't define a size then it's non-file-like, and trying to open it with a ReadWrite handle will error; reads from it may not return data you previously wrote to it.
I'd also add a note about checking docs for that entry's write semantics (non-file-like entries only)
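The rule in the quoted docs (a ReadWrite open fails on a non-file-like entry) could be expressed as a small check. The names below (`openMode`, `checkOpen`, `errNotFileLike`) are hypothetical and chosen for illustration; they are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// openMode is a hypothetical, simplified stand-in for FUSE open flags.
type openMode int

const (
	readOnly openMode = iota
	writeOnly
	readWrite
)

// errNotFileLike is an illustrative error value, not one defined by Wash.
var errNotFileLike = errors.New("entry is not file-like: cannot open for both reading and writing")

// checkOpen rejects ReadWrite handles on entries that do not define a size,
// since reads from them may not reflect earlier writes.
func checkOpen(fileLike bool, mode openMode) error {
	if mode == readWrite && !fileLike {
		return errNotFileLike
	}
	return nil
}

func main() {
	fmt.Println(checkOpen(true, readWrite))
	fmt.Println(checkOpen(false, writeOnly))
	fmt.Println(checkOpen(false, readWrite))
}
```

ReadOnly and WriteOnly handles stay available in this sketch, which matches the idea that a non-file-like entry can still be read from or written to separately.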
fuse/file.go (outdated)
        return &file{fuseNode: newFuseNode("f", p, e), writers: make(map[fuse.HandleID]struct{})}
    }

    func (f *file) isFileLike() bool {
Hmm, maybe `isFileLike` => `isFileLikeEntry`? It's slightly weird seeing `file#isFileLike`, esp. if you're new to this code.
Force-pushed from 74e4ad5 to 7ba3196.
    interface allows sending data to the entry.

    Wash distinguishes between two different patterns for things you can read and write. It considers
    a "file-like" entry to be one with a defined size (so the `size` attribute is set when listing the
OK, I agree about using `size` alone for file-like things. But I think we should remove `size` from the `metadata/data.json` files because those aren't really "files" (editing them makes no sense) so they shouldn't be treated as such unless we have a good reason to do so in the future.
I think editing metadata does make sense for some things: the metadata is a reflection of a mix of state and configuration, and you may be able to make changes to that configuration. Editing those values to change the configuration would make sense. This is especially true for everything in Kubernetes.
Right, but what you write isn't going to strictly match what's read -- there isn't a 1:1 symmetry between the content because we output pretty-printed JSON. But I don't have strong feelings about keeping size around either so up to you.
Ok, that makes sense.
plugin/types.go (outdated)
    // entry is only Writable, then only full writes (starting from offset 0) are
    // allowed, anything else initiated by the filesystem will result in an error.
    //
    // It's up to the implementer to decide how much data integrity to guarantee.
What do you mean by "data integrity"? Clarifying that would be useful.
Primarily around whether writing volumes makes sure an fsync happens so that data is serialized to persistent storage before returning. Not sure how to summarize that.
Ah ok. Sounds like an issue for file-like entries, which could be summed up as `plugin.Write(data); data == plugin.Read` (ensure that the next Read from the API will fetch the written data). Don't know if that implies "always call fsync" for volumes.
I ended up removing this because I don't think any plugins should explicitly implement something like calling `sync` after writing. It represents a possible loss of data, but the machine would have to die right after and we'd likely get a connection error if that happened. Things like vim and databases will call Fsync, and we're currently lying to them in some cases that it did something, but for now I'd suggest don't run a database on Wash's filesystem.
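The property mentioned earlier in this thread, `plugin.Write(data); data == plugin.Read`, can be written down as a small check. The sketch below is an assumption-laden illustration: `readWriter`, `roundTrips`, `memFile`, and `appendLog` are hypothetical, and Wash's real Readable and Writable interfaces take more parameters than shown here.

```go
package main

import (
	"bytes"
	"fmt"
)

// readWriter is a hypothetical, simplified interface for illustration only.
type readWriter interface {
	Read() ([]byte, error)
	Write(data []byte) error
}

// roundTrips reports whether a Read issued right after Write returns
// exactly the bytes that were written.
func roundTrips(e readWriter, data []byte) (bool, error) {
	if err := e.Write(data); err != nil {
		return false, err
	}
	got, err := e.Read()
	if err != nil {
		return false, err
	}
	return bytes.Equal(got, data), nil
}

// memFile is an in-memory file-like entry: Write replaces the content.
type memFile struct{ content []byte }

func (m *memFile) Read() ([]byte, error) { return m.content, nil }
func (m *memFile) Write(data []byte) error {
	m.content = append([]byte(nil), data...)
	return nil
}

// appendLog is a non-file-like entry: Write appends, so reads are not symmetric.
type appendLog struct{ lines []byte }

func (a *appendLog) Read() ([]byte, error) { return a.lines, nil }
func (a *appendLog) Write(data []byte) error {
	a.lines = append(a.lines, data...)
	return nil
}

func main() {
	ok, _ := roundTrips(&memFile{}, []byte("abc"))
	fmt.Println("memFile round-trips:", ok)
	ok, _ = roundTrips(&appendLog{lines: []byte("old\n")}, []byte("new\n"))
	fmt.Println("appendLog round-trips:", ok)
}
```

The check says nothing about durability (whether an fsync reached persistent storage); it only captures the read-after-write symmetry expected of file-like entries.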
plugin/externalPluginEntry_test.go (outdated)

    var _ = internal.Command(&mockCommand{})

    func (suite *ExternalPluginEntryTestSuite) TestWrite() {
Why does Write have so many tests? Seems like it should be symmetric with other `InvokeAndWait` tests.
I'll have to spend a little time on this tomorrow. I'm not sure the other tests are verifying all the assertions we expect. They also mock at a higher level (the script's InvokeAndWait call) than Write had available originally.
Force-pushed from 5531a82 to adc7bc3.
Address review comments to try and clarify code. Signed-off-by: Michael Smith <[email protected]>
Moves script `InvokeAndWait` behavior to a function on the invocation so we can set up more complicated invocations before running them. Keeps the script `InvokeAndWait` as a helper for the most common case. Signed-off-by: Michael Smith <[email protected]>
Don't set size on console output and metadata files. The plugin system knows how to get the size just as efficiently, and explicitly setting the size implies they'll behave like files (which may not be true). Signed-off-by: Michael Smith <[email protected]>
Implement the new Writable interface (defined to mirror the Readable interface). Implement Write on simple file interfaces: local, AWS S3, and Google Cloud Storage.
Resolves #620. Also fixes #656.