
Yet another attempt to solve TTC errors... #203

Open · wants to merge 1 commit into base: develop

Conversation

@Frix-x (Owner) commented Feb 14, 2025

Summary by Sourcery

Refactor the accelerometer data saving mechanism to use a dedicated writer process and queue, preventing blocking I/O and improving data integrity. Switch the on-disk format to JSON lines for easier handling and parsing. Modify shaketune_process.py to accept filenames as input and load the data from disk. Update graph creation to use the new loading mechanism and save graphs with consistent naming. Fix various data-handling and analysis bugs, and improve error handling and reporting.

Bug Fixes:

  • Resolve "Timer too close" errors during accelerometer data saving: a dedicated writer process and queue now handle disk I/O asynchronously, so Klipper's main thread is never blocked and each measurement is fully written to disk before the next one starts.
  • Fix race conditions and data corruption when saving accelerometer data: only the dedicated writer process touches the file, so concurrent access can no longer corrupt it.
  • Fix delayed sample retrieval and graph generation that could occur when the main thread was blocked during data saving.
  • Fix data loss and corruption when assembling measurement chunks by synchronizing the merging process against concurrent or incomplete writes.
  • Fix data loading and processing by checking that each analysis tool receives the expected number of measurements.
  • Fix handling of incorrect measurement names by validating them against the required format.

Enhancements:

  • Accelerometer data is saved through a dedicated writer process and queue, keeping disk I/O off Klipper's main thread and improving responsiveness.
  • Measurements are written to disk in a consistent, synchronized order, improving data integrity and reliability.
  • Data is serialized as JSON lines (one measurement per line), simplifying saving and loading and making the code easier to maintain (see the format sketch below).
  • Error handling and reporting during save and load operations is more informative, easing debugging and troubleshooting.
  • Analysis tools verify their measurement counts up front, yielding clearer error messages and more reliable results.
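
To make the new on-disk format concrete, here is a minimal reader sketch for the JSON-lines .stdata format described above (Zstandard compression and the one-JSON-object-per-line layout come from the reviewer's guide below; the function name and the happy-path-only error handling are illustrative assumptions):

```python
import io
import json
from pathlib import Path

from zstandard import ZstdDecompressor


def load_stdata_sketch(filename: Path) -> list:
    # Illustrative reader: a Zstandard-compressed stream holding one JSON
    # object per line, e.g. {"name": ..., "samples": [...]}.
    measurements = []
    with open(filename, 'rb') as f:
        dctx = ZstdDecompressor()
        with dctx.stream_reader(f) as reader:
            text = io.TextIOWrapper(reader, encoding='utf-8')
            for line in text:
                if line.strip():
                    measurements.append(json.loads(line))
    return measurements
```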

@sourcery-ai bot (Contributor) commented Feb 14, 2025

Reviewer's Guide by Sourcery

This pull request refactors the way accelerometer data is saved and processed in ShakeTune. It introduces a dedicated writer process to handle disk writes, which prevents blocking I/O and improves performance. The changes also simplify the graph creation process and ensure that data is saved correctly.

Sequence diagram for saving accelerometer data

```mermaid
sequenceDiagram
  participant ST as ShakeTune
  participant MM as MeasurementsManager
  participant WP as WriterProcess
  participant Queue as WriterQueue
  participant Disk

  ST->>MM: add_measurement(name, samples)
  alt len(measurements) > chunk_size
    MM->>Queue: put(measurement)
    activate WP
    Queue->>WP: meas
    WP->>Disk: write(meas)
    deactivate WP
  end
  ST->>MM: save_stdata(filename)
  MM->>Queue: put(STOP_SENTINEL)
  WP->>Disk: flush()
  Disk-->>WP: done
  WP-->>MM: done
  MM-->>ST: done
```

Updated class diagram for MeasurementsManager

```mermaid
classDiagram
    class MeasurementsManager {
        -chunk_size: int
        -k_reactor
        -measurements: List~Measurement~
        -temp_file: Path
        -writer_queue: Queue
        -is_writing: Value
        -writer_process: Optional~Process~
        +__init__(chunk_size: int, k_reactor=None)
        +clear_measurements(keep_last: bool = False)
        +append_samples_to_current_measurement(additional_samples: SamplesList)
        +add_measurement(name: str, samples: SamplesList = None, timeout: float = 30)
        -_writer_loop(output_file: Path, write_queue: Queue, is_writing: Value)
        -_flush_chunk()
        +save_stdata(filename: Path, timeout: int = 30)
        +get_measurements() : List~Measurement~
        +load_from_stdata(filename: Path) : List~Measurement~
        +load_from_csvs(klipper_CSVs: List~Path~) : List~Measurement~
        +__del__()
    }
    class Measurement {
        name: str
        samples: List~Sample~
    }
    Measurement -- MeasurementsManager: contains
    note for MeasurementsManager "Manages accelerometer measurements, including writing to disk using a dedicated process."
```
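
For concreteness, a hypothetical usage sketch built from the method signatures in the diagram above (the import path is assumed from the file list below; the chunk size, measurement name, and file path are illustrative):

```python
from pathlib import Path

from shaketune.helpers.accelerometer import MeasurementsManager  # assumed import path

mm = MeasurementsManager(chunk_size=2)         # flush to the writer once >2 measurements are buffered
mm.add_measurement('axis_x')                   # start a new named measurement
mm.append_samples_to_current_measurement([])   # stream accelerometer samples into it
mm.save_stdata(Path('/tmp/shaketune.stdata'))  # drain the writer queue and finalize the file
loaded = mm.load_from_stdata(Path('/tmp/shaketune.stdata'))
```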

File-Level Changes

Replaced the previous chunk-based file saving mechanism with a single temporary file and a dedicated writer process to handle disk writes.
  • Introduced a dedicated writer process with a queue for managing write operations.
  • Utilized a single temporary file for storing accelerometer data.
  • Implemented JSON serialization for writing measurement objects to disk.
  • Used Zstandard compression for the data written to disk.
  • Added a sentinel value to signal the writer process to stop.
  • Implemented a timeout mechanism to prevent indefinite waiting for the writer process (see the sketch after this entry).
  • Removed the chunk-based file saving mechanism.
  • Removed the temporary directory creation and cleanup logic.
Files: shaketune/helpers/accelerometer.py
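
A sketch of the stop-sentinel and timeout pattern listed in this entry (STOP_SENTINEL and the queue/process pair follow the reviewer's guide above; the terminate() fallback is an illustrative assumption, as the PR's exact handling is not shown on this page):

```python
from multiprocessing import Process, Queue

STOP_SENTINEL = 'STOP_SENTINEL'


def stop_writer(write_queue: Queue, writer_process: Process, timeout: float = 30.0) -> None:
    # Ask the writer to exit once it has drained the queue...
    write_queue.put(STOP_SENTINEL)
    # ...then wait for it, but never indefinitely.
    writer_process.join(timeout)
    if writer_process.is_alive():
        writer_process.terminate()  # illustrative fallback, not necessarily what the PR does
```
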
Modified the ShakeTuneProcess class to accept a list of filenames instead of a MeasurementsManager object.
  • Updated the run method to accept a filename or a list of filenames.
  • Modified the _shaketune_process method to load measurements from the provided file(s).
  • Removed the measurements_manager parameter from the _shaketune_process method.
  • Added file existence checks and suffix validation.
  • Instantiated the MeasurementsManager inside the _shaketune_process method.
Files: shaketune/shaketune_process.py
Updated the graph creators to save directly to the final file path and removed the intermediate .stdata file saving.
  • Added a define_output_target method to set the output file path (sketched after this entry).
  • Modified the _save_figure method to save the figure to the defined output path.
  • Removed the measurements_manager parameter from the _save_figure method.
  • Removed the logic for saving the raw data to a separate .stdata file.
  • Added a get_folder method to retrieve the output folder.
  • Removed the override_output_target method.
  • Removed the _graph_date attribute.
Files: shaketune/graph_creators/graph_creator.py
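
A hypothetical sketch of the reworked graph-creator flow described above (the creator variable and output path are illustrative; only define_output_target and get_folder come from this PR):

```python
from pathlib import Path


def render_graph(creator) -> None:
    # Hypothetical: the caller fixes the final output path up front, so the
    # creator's _save_figure() can write directly to it later.
    creator.define_output_target(Path('/tmp/shaketune/belts.png'))  # illustrative path
    print(creator.get_folder())  # the folder that will receive the figure
```
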
Modified the commands to use the new data saving and graph creation flow.
  • Instantiated the MeasurementsManager with the Klipper reactor.
  • Removed calls to accelerometer.wait_for_samples().
  • Removed calls to measurements_manager.wait_for_data_transfers().
  • Added logic to define the output target for the graph creator.
  • Called measurements_manager.save_stdata() to save the data to disk.
  • Passed the filename to st_process.run() (see the sketch after this entry).
Files: shaketune/commands/axes_shaper_calibration.py, shaketune/commands/compare_belts_responses.py, shaketune/commands/create_vibrations_profile.py, shaketune/commands/axes_map_calibration.py, shaketune/commands/excitate_axis_at_freq.py
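
A condensed, hypothetical view of the new command flow (method names follow the bullets above; the function wrapper and file path are illustrative):

```python
from pathlib import Path


def finish_command(measurements_manager, st_process) -> None:
    # Hypothetical tail of a command such as compare_belts_responses():
    # persist the data, then hand the filename (not the manager) to the process.
    filename = Path('/tmp/shaketune/raw_data.stdata')  # illustrative path
    measurements_manager.save_stdata(filename)
    st_process.run(filename)
```
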
Updated the graph computations to handle cases where there are not enough measurements.
  • Added a check to ensure that there are 3 measurements available before computing the axes map.
  • Added a check to ensure that there are 2 measurements available before computing the belts graph.
Files: shaketune/graph_creators/axes_map_graph_creator.py, shaketune/graph_creators/belts_graph_creator.py
Updated the CLI to use the new output target definition.
  • Called graph_creator.define_output_target() instead of graph_creator.override_output_target().
Files: shaketune/cli.py


@Frix-x added the bug (Something isn't working) label on Feb 14, 2025
@sourcery-ai bot left a comment:
Hey @Frix-x - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding a method to MeasurementsManager to check if the writer process is running.
  • The __del__ method in MeasurementsManager might not always be called, so consider a more explicit cleanup mechanism.
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good


Comment on lines +55 to +64
```python
def _writer_loop(self, output_file: Path, write_queue: Queue, is_writing: Value):
    try:
        with open(output_file, 'wb') as f:
            cctx = ZstdCompressor(level=3)
            with cctx.stream_writer(f) as compressor:
                while True:
                    meas = write_queue.get()
                    if meas == STOP_SENTINEL:
                        break
                    with is_writing.get_lock():
```

suggestion (bug_risk): Consider consolidating the _is_writing flag update within a single critical section.

Currently, the flag is set to True immediately before writing and then set to False immediately after. Wrapping both the JSON conversion and the compressor.write call in one lock acquisition may help ensure that the flag accurately reflects the entire duration of the write operation, reducing any potential race conditions.

Suggested implementation:

```python
                        with is_writing.get_lock():
                            is_writing.value = True
                            line = (json.dumps(meas) + '\n').encode('utf-8')
                            compressor.write(line)
                            is_writing.value = False
```

This change assumes that the entire JSON conversion and writing operation should be protected by the lock. If the JSON conversion is time-consuming and you only need to mark that a write is in progress, consider whether additional performance implications need to be evaluated.

Frix-x (Owner, Author) replied:

No need for this, I just want to be able to see if a writing operation is in progress from the main thread
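
For context, a minimal sketch of such a main-thread check, assuming is_writing is the shared multiprocessing.Value('b', ...) flag from this PR (the helper name is illustrative):

```python
from multiprocessing import Value


def writer_is_busy(is_writing: Value) -> bool:
    # Read the shared flag set by the writer process; taking the lock keeps
    # the read consistent with the writer's paired True/False updates.
    with is_writing.get_lock():
        return bool(is_writing.value)
```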

Resolved review threads on shaketune/helpers/accelerometer.py and shaketune/graph_creators/graph_creator.py.
Comment on lines +37 to +46
```python
def run(self, filenames: Union[Path, List[Path]]) -> None:
    filelist = []

    # Single .stdata or a legacy .csv file
    if isinstance(filenames, Path):
        if not filenames.suffix == '.stdata' and not filenames.suffix == '.csv':
            filenames = filenames.with_suffix('.stdata')
        if not filenames.exists():
            raise FileNotFoundError(f'File {filenames} does not exist!')
        filelist.append(filenames)
```

suggestion (bug_risk): Clarify handling of different file types in run().

The method now accepts either a single file or a list of files and treats .stdata and .csv differently. It might be helpful to explicitly document (or enforce) that a homogeneous set of files is expected, or add error handling in case mixed file types are passed accidentally.

Suggested change (replacing the block above):

```python
def run(self, filenames: Union[Path, List[Path]]) -> None:
    filelist = []
    # Single file handling: .stdata or legacy .csv
    if isinstance(filenames, Path):
        if filenames.suffix not in ('.stdata', '.csv'):
            filenames = filenames.with_suffix('.stdata')
        if not filenames.exists():
            raise FileNotFoundError(f"File {filenames} does not exist!")
        filelist.append(filenames)
    # List handling: enforce homogeneous file types
    elif isinstance(filenames, list):
        extensions = {f.suffix for f in filenames}
        if len(extensions) > 1:
            raise ValueError(
                'Mixed file types are not allowed. Please supply files of a '
                'homogeneous type (.stdata or .csv).'
            )
        for f in filenames:
            if f.suffix not in ('.stdata', '.csv'):
                f = f.with_suffix('.stdata')
            if not f.exists():
                raise FileNotFoundError(f"File {f} does not exist!")
            filelist.append(f)
```

@@ -36,130 +36,155 @@

```python
class MeasurementsManager:
```

issue (complexity): Consider extracting the asynchronous writing logic into a dedicated MeasurementWriter class to decouple writing and measurement management.

Consider extracting the asynchronous writing and state‐monitoring logic into a dedicated helper class (e.g., `MeasurementWriter`). That way, the `MeasurementsManager` can focus solely on measurement management while the writer class handles interprocess communication and buffering.

For example, you could create a new class:

```python
from multiprocessing import Process, Queue, Value
from pathlib import Path
import json
from zstandard import ZstdCompressor, FLUSH_FRAME
from ..helpers.console_output import ConsoleOutput

STOP_SENTINEL = 'STOP_SENTINEL'

class MeasurementWriter:
    def __init__(self, output_file: Path):
        self.output_file = output_file
        self.queue = Queue()
        self.is_writing = Value('b', False)
        self.process = Process(target=self._writer_loop, args=())
        self.process.start()

    def _writer_loop(self):
        try:
            with open(self.output_file, 'wb') as f:
                cctx = ZstdCompressor(level=3)
                with cctx.stream_writer(f) as compressor:
                    while True:
                        meas = self.queue.get()
                        if meas == STOP_SENTINEL:
                            break
                        with self.is_writing.get_lock():
                            self.is_writing.value = True
                        line = (json.dumps(meas) + '\n').encode('utf-8')
                        compressor.write(line)
                        with self.is_writing.get_lock():
                            self.is_writing.value = False
                    compressor.flush(FLUSH_FRAME)
        except Exception as e:
            ConsoleOutput.print(f'Error writing to file {self.output_file}: {e}')

    def enqueue(self, meas):
        self.queue.put(meas)

    def finish(self):
        self.queue.put(STOP_SENTINEL)
        self.process.join()

```

Then, in your MeasurementsManager, hand off writing responsibilities:

```python
class MeasurementsManager:
    def __init__(self, chunk_size: int, output_file: Path, k_reactor=None):
        self._chunk_size = chunk_size
        self.measurements = []
        self._k_reactor = k_reactor
        self.writer = MeasurementWriter(output_file)

    def add_measurement(self, name: str, samples: SamplesList = None):
        samples = samples if samples is not None else []
        self.measurements.append({'name': name, 'samples': samples})
        if len(self.measurements) > self._chunk_size:
            self._flush_chunk()

    def _flush_chunk(self):
        if len(self.measurements) <= 1:
            return
        # Flush all measurements except the last one.
        flush_list = self.measurements[:-1]
        for meas in flush_list:
            self.writer.enqueue(meas)
        self.measurements = self.measurements[-1:]

    def save_stdata(self, final_filename: Path, timeout: int = 30):
        # Flush remaining measurements.
        for meas in self.measurements:
            self.writer.enqueue(meas)
        self.measurements = []
        # Signal writer to finish.
        self.writer.finish()
        # Now handle renaming or final file operations.
        try:
            if final_filename.exists():
                final_filename.unlink()
            # Rename temp file to final filename.
            # (Assumes self.writer.output_file is the temp file.)
            Path(self.writer.output_file).rename(final_filename)
        except Exception as e:
            ConsoleOutput.print(f'Error finalizing file {final_filename}: {e}')
```

Steps to reduce complexity:

  • Encapsulate Writing Logic: Shift all queue/process management from MeasurementsManager to MeasurementWriter.
  • Simplify Manager: Have MeasurementsManager only enqueue measurement chunks without inner process state checks.
  • Abstract Waiting Mechanism: If needed, move waiting logic (based on k_reactor) into the writer class or provide a helper method to check the writer’s status.

These changes keep all functionality intact while reducing nesting and interleaving of concerns within a single class.

@@ -19,6 +20,8 @@

```python
def compare_belts_responses(gcmd, config, st_process: ShakeTuneProcess) -> None:
```

issue (code-quality): Low code quality found in compare_belts_responses - 16% (low-code-quality)


Explanation: The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early (illustrated below).
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.
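
As a generic illustration of the guard-clause advice above (hypothetical function and stand-in helper, not code from this PR):

```python
# Before: the real work hides inside nested conditionals
def process_command(cmd):
    if cmd is not None:
        if cmd.is_valid():
            execute(cmd)  # `execute` is a stand-in helper


# After: guard clauses return early, keeping the main logic at one indent level
def process_command(cmd):
    if cmd is None:
        return
    if not cmd.is_valid():
        return
    execute(cmd)
```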

@@ -20,6 +21,8 @@

```python
def create_vibrations_profile(gcmd, config, st_process: ShakeTuneProcess) -> None:
```

issue (code-quality): Low code quality found in create_vibrations_profile - 12% (low-code-quality)


Explanation: same quality-score explanation and refactoring advice as for compare_belts_responses above.

@@ -17,6 +18,8 @@

```python
def excitate_axis_at_freq(gcmd, config, st_process: ShakeTuneProcess) -> None:
```

issue (code-quality): Low code quality found in excitate_axis_at_freq - 15% (low-code-quality)


Explanation: same quality-score explanation and refactoring advice as for compare_belts_responses above.

```python
        except Exception as e:
            ConsoleOutput.print(f'Warning: unable to assemble chunks into {filename}: {e}')
        # Add extension if not provided
        if not filename.suffix == '.stdata':
```


suggestion (code-quality): Simplify logical expression using De Morgan identities (de-morgan)

Suggested change:

```python
# Before:
if not filename.suffix == '.stdata':
# After:
if filename.suffix != '.stdata':
```


```python
        # Single .stdata or a legacy .csv file
        if isinstance(filenames, Path):
            if not filenames.suffix == '.stdata' and not filenames.suffix == '.csv':
```

suggestion (code-quality): We've found these issues:

  • Simplify logical expression using De Morgan identities [×2] (de-morgan)
  • Replace multiple comparisons of same variable with in operator (merge-comparisons)
Suggested change:

```python
# Before:
if not filenames.suffix == '.stdata' and not filenames.suffix == '.csv':
# After:
if filenames.suffix not in ['.stdata', '.csv']:
```
