Problem
The CSV files currently generated by the StepLogCallback class are irregular and difficult to parse, because some entries of the observations column are automatically converted to strings by pandas during CSV export. @clemens-fricke came up with a way of parsing the CSV file back into a pandas data frame. It was tested successfully on artificially generated CSV files, but it did not work on a recent CSV file that I produced during an actual experiment run (see the attached step_log.csv).
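To illustrate the effect, here is a minimal sketch with a made-up two-row data frame (the column names mirror the step log, the values are invented). Each array cell is written via its printed representation, so it comes back as a plain string after a CSV round trip:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical miniature step log: one NumPy array per cell.
demo = pd.DataFrame(
    {
        "timesteps": [1, 2],
        "observations": [np.array([[0.1, 0.2]]), np.array([[0.3, 0.4]])],
    }
)

# Round trip through CSV in memory instead of touching the disk.
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

first_cell = restored["observations"].iloc[0]
# The cell is no longer an array but the text that NumPy printed for it.
print(type(first_cell).__name__, repr(first_cell))
```

For longer arrays, NumPy's printed representation additionally wraps across several lines and, beyond its print threshold, abbreviates the content with an ellipsis. This might be one source of the irregular entries (an assumption, not verified against the attached file).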
Alternative solution for parsing

import numpy as np
import pandas as pd


# Function to parse lists (for actions, observations, and rewards)
def parse_lists(value, make2d=False):
    if isinstance(value, str):
        try:
            # first remove brackets
            value = value.strip("[").strip("]")
            # remove trailing and leading whitespaces
            value = value.strip()
            # the remaining string should contain the values, separated by spaces
            entries = value.split()
            # convert entries to float
            parsed_values = np.array([float(number) for number in entries], dtype=float)
            # Reshape to keep the 2D structure if necessary
            return parsed_values.reshape(1, -1) if make2d else parsed_values
        except ValueError:
            print(f"ERROR: Cannot convert value {value} to list of floats.")
            return np.nan
    return value


# Load CSV file (step_log_path is the path to the step_log.csv file)
df = pd.read_csv(step_log_path, dtype=str, index_col="timesteps")  # Read everything as strings initially

# Convert numeric columns
df["episodes"] = df["episodes"].astype(int)

# Apply parsing functions
df["observations"] = df["observations"].apply(lambda x: parse_lists(x, True))
df["actions"] = df["actions"].apply(parse_lists)
df["rewards"] = df["rewards"].apply(parse_lists)

This parses the provided step_log.csv file correctly, but it is not straightforward.
Alternative solution
The problem that we're currently facing is purely related to the CSV export. Instead of providing a utility function based on regular expressions, which we might have to update whenever we encounter edge cases that have not yet shown up in our current experiments, we could modify the export of the data and use a line-based JSON format instead (line-based so that new entries can still be appended).
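A line-based export could look roughly like the sketch below. This is not the actual StepLogCallback code: the helper name append_step, the file name step_log.jsonl, and the example values are assumptions made for illustration only.

```python
import json
import os
import tempfile

import numpy as np


def append_step(path, record):
    """Append one step as a single JSON line (hypothetical helper)."""
    # NumPy arrays and scalars are not JSON serializable, so convert them
    # to plain Python lists and numbers first.
    serializable = {
        key: value.tolist() if isinstance(value, (np.ndarray, np.generic)) else value
        for key, value in record.items()
    }
    # Mode "a" appends, so earlier steps stay untouched.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(serializable) + "\n")


# Made-up example data, written to a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "step_log.jsonl")
append_step(log_path, {"timesteps": 1, "episodes": 0,
                       "observations": np.array([[0.1, 0.2]]),
                       "actions": np.array([1.0]), "rewards": np.array([0.5])})
append_step(log_path, {"timesteps": 2, "episodes": 0,
                       "observations": np.array([[0.3, 0.4]]),
                       "actions": np.array([0.0]), "rewards": np.array([1.5])})

with open(log_path, encoding="utf-8") as handle:
    lines = handle.read().splitlines()
print(lines[0])
```

Because every step is one self-contained line, the nested list structure of the observations survives the export as written.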
Changing to this type of export would have the benefit of retaining the original data structure, and the file could be parsed with a single line, without weird conversions during parsing.
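As a sketch of what that single line might be, assuming a JSON Lines file with the same column names as the current step log (the file name and values below are invented), pandas can read such a file directly via read_json with lines=True:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Create a small made-up JSON Lines file so the example is self-contained.
log_path = os.path.join(tempfile.mkdtemp(), "step_log.jsonl")
with open(log_path, "w", encoding="utf-8") as handle:
    handle.write('{"timesteps": 1, "episodes": 0, "observations": [[0.1, 0.2]], '
                 '"actions": [1.0], "rewards": [0.5]}\n')
    handle.write('{"timesteps": 2, "episodes": 0, "observations": [[0.3, 0.4]], '
                 '"actions": [0.0], "rewards": [1.5]}\n')

# The actual parsing step: one record per line, no string post-processing.
# convert_dates=False switches off pandas' date inference for this file.
df = pd.read_json(log_path, lines=True, convert_dates=False).set_index("timesteps")

# Cells hold nested Python lists; convert to arrays only where needed.
first_obs = np.array(df["observations"].iloc[0])
print(first_obs.shape)
```

The cells come back as nested lists rather than NumPy arrays, so a conversion with np.array is still needed wherever array operations are required.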
UPDATE: I slightly modified (and simplified) the solution for parsing the generated csv above.
Turns out that sometimes using your brain is actually a good alternative to continuously (and desperately) prompting ChatGPT when the latter is not able to find a solution for all the edge cases. And turns out that the solution is often much simpler than you think 🙃
Conclusion
I think I have a clear favourite solution here :D But I wanted to raise this issue first. I'm open to discussion and different opinions :)