docs4understanding/active.md
+16 −10
@@ -9,18 +9,23 @@ In this instance, we would expect the OSS to implement something with the semantics

 ```plantuml
 package "hardware view 1" {
-
+left to right direction
 component storage {
 database ost1
 database ost2
 database ost3
-port OSS
+component OSS
+OSS -- ost1
+OSS -- ost2
+OSS -- ost3
 }

 component node {
 component application
-port kernel
+component kernel
+application --> kernel
 }
+
 kernel --> OSS: LAN traffic.

 }
@@ -36,8 +41,8 @@ An example of application parallelism would be the use of asynchronous requests
 The more interesting problem for now arises where we have "storage parallelism".

 ```plantuml
-package "hardware view 1" {
-
+package "hardware view 2" {
+left to right direction
 component storage {
 database ost1
 database ost2
@@ -46,17 +51,18 @@ package "hardware view 1" {

 component node {
 component application
-port kernel
+component kernel
+application --> kernel
 }
 kernel --> ost1: LAN
 kernel --> ost2: LAN
 kernel --> ost3: LAN
 }
 ```

-Where would we implement what? There is no benefit to be gained from implementing `_decode_chunks` in the kernel, as the entire chunk has already been served to the compute node, and no data movement has been avoided. Depending on the layout of chunks across OSTs, there may be no benefit in attempting active storage. It depends on how contiguous the data is in storage. If however, it is possble to
-break down the logic of `_from_storage` so that individual "part_chunks" which are contiguous are processed on the storage side of the LAN (in the ost or nearby) then meaningful performance is possible.
-The Python client in the end would simply see a list of partial products of the computational method which have come direct from the osts. It will not care whether those parts came from a different breakdown of storage than it anticpated in the chunking (though of course the storage will need to do the necessary mapping to generate the partial sums).
+Where would we implement what? There is no benefit to be gained from implementing `_decode_chunks` in the kernel: the entire chunk has already been served to the compute node, so no data movement has been avoided. Depending on the layout of chunks across OSTs, there may be no benefit in attempting active storage at all; it depends on how contiguous the data is in storage.
+
+If, however, it is possible to break down the logic of `_from_storage` so that individual "part_chunks" which are contiguous are processed on the storage side of the LAN (in the OST or nearby), then meaningful performance gains are possible. The Python client would in the end simply see a list of partial products of the computational method which have come directly from the OSTs. It will not care whether those parts came from a different breakdown of storage than it anticipated in the chunking (though of course the storage will need to do the necessary mapping to generate the partial sums).

-In the longer term, where we expect that we will have to pass a decompression method down through the `_decode_chunk` interface, it _will_ be necessary for the computational storage to respect the `_decode_chunk` interface server-side. This is of course what is required with S3, where we effectively need an S3 proxy to do the work.
+In the longer term, where we expect that we will have to pass a decompression method down through the `_decode_chunk` interface, it _will_ be necessary for the computational storage to respect the `_decode_chunk` interface server-side. This is of course what is required with S3, where we effectively need an S3 proxy to do the work. It may be that this is also required in some implementations of POSIX storage systems if they wish to implement computational storage, or else they will need to respect block-contiguous placement in some way.
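To make the "list of partial products" idea added above concrete, here is a minimal client-side sketch. Everything in it is an assumption chosen for illustration (the `combine_partials` helper and the `(sum, count)` representation are not the project's actual `_from_storage` or `_decode_chunks` interfaces); it only shows why the client does not need to care how storage partitioned the chunk.

```python
# Minimal sketch only: combining hypothetical per-OST partial results client-side.
# The (partial_sum, count) representation and the function name are illustrative
# assumptions, not the actual _from_storage / _decode_chunks interfaces.
from typing import Iterable, Tuple

Partial = Tuple[float, int]  # (sum of values in one contiguous part_chunk, value count)

def combine_partials(partials: Iterable[Partial]) -> float:
    """Reduce partial sums/counts returned from the storage side into one mean.

    Because addition is associative and commutative, the result does not depend
    on how the storage layer broke the chunk into part_chunks.
    """
    total = 0.0
    count = 0
    for part_sum, part_count in partials:
        total += part_sum
        count += part_count
    if count == 0:
        raise ValueError("no data returned from storage")
    return total / count

# Example: three part_chunks served directly by three OSTs.
print(combine_partials([(10.0, 4), (3.5, 2), (6.5, 4)]))  # 2.0
```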
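The last point, that a decompression method would have to travel down to the storage side, can also be sketched. The function name, the zlib and float64 assumptions, and the `(sum, count)` result below are all illustrative choices, not the actual `_decode_chunk` server-side interface; the sketch only shows that the storage side (or an S3 proxy) must decode each contiguous part_chunk before reducing it, rather than shipping the bytes back over the LAN.

```python
# Minimal sketch only: what an OST-adjacent service or S3 proxy might do per
# part_chunk if a decompression method is passed down with the request.
# zlib, float64 and the (sum, count) result are assumptions for illustration.
import zlib
from array import array

def reduce_part_chunk(raw: bytes, compression: str | None = "zlib") -> tuple[float, int]:
    """Decompress one contiguous part_chunk and return a partial (sum, count)."""
    if compression == "zlib":
        payload = zlib.decompress(raw)
    elif compression is None:
        payload = raw
    else:
        raise ValueError(f"unsupported compression: {compression}")
    values = array("d", payload)  # assume native-endian float64 for this sketch
    return sum(values), len(values)

# Example: compress ten float64 values and reduce them "server-side".
data = array("d", range(10)).tobytes()
print(reduce_part_chunk(zlib.compress(data)))  # (45.0, 10)
```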