CSVLogger is setting val_* metrics to nan despite no validation data being provided. #21025

Open
innat opened this issue Mar 13, 2025 · 2 comments


innat commented Mar 13, 2025

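There are basically two issues. The first is the one in the title: no validation data is passed to fit(), yet the CSV written by CSVLogger contains val_* columns set to nan. Code to reproduce: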

import tensorflow as tf

import os
import numpy as np
import keras
from keras import layers
from keras import ops
keras.__version__ # 3.8.0

inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Preprocess the data (these are NumPy arrays)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255
y_train = y_train.astype("float32")
y_test = y_test.astype("float32")
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

model.compile(
    optimizer=keras.optimizers.RMSprop(),  # Optimizer
    # Loss function to minimize
    loss=keras.losses.SparseCategoricalCrossentropy(),
    # List of metrics to monitor
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

print("Fit model on training data")
csv_logger = keras.callbacks.CSVLogger('training.csv')

history = model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=2,
    callbacks=[csv_logger]
)
import pandas as pd
history = pd.read_csv('training.csv')
history.head()

(Screenshot attached in the original issue.)

The second issue: the scores recorded for the training data in the CSV file are also wrong.
Fit model on training data
Epoch 1/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - loss: 0.5810 - sparse_categorical_accuracy: 0.8389
Epoch 2/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - loss: 0.1661 - sparse_categorical_accuracy: 0.9494

The training log reports an accuracy of 0.83 for the first epoch, but the CSV records 0.90. The loss values disagree in the same way.

dhantule (Contributor) commented

Hi @innat, thanks for reporting this.

The mismatch could be because the scores shown in the training log are updated after each batch.

The scores in history.history match the scores in the CSV; the loss and metrics you get from history.history are averages over the epoch. I provided validation data and ran your code in this gist, and it seems to work.
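To make the distinction above concrete, here is a minimal sketch in plain Python of how a value refreshed after every batch can differ from a per-epoch average. The batch accuracies are made-up numbers and `running_means` is a hypothetical helper; this only illustrates the arithmetic and is not a confirmed description of how Keras computes what it prints or what CSVLogger writes.

```python
# Hypothetical per-batch accuracies for one epoch (made-up numbers,
# not taken from the run in this issue).
batch_acc = [0.50, 0.70, 0.80, 0.90, 0.95]


def running_means(values):
    """Return the mean of all values seen so far, after each batch."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# Value a display could show if it is refreshed after every batch.
running = running_means(batch_acc)

# Average over the whole epoch: the running mean after the last batch.
epoch_average = running[-1]

# Assumption for illustration only: a display that averages the
# per-batch running values again ends up below the epoch average
# whenever accuracy improves during the epoch.
mean_of_running = sum(running) / len(running)

print("running values:", [round(m, 3) for m in running])
print("epoch average:", round(epoch_average, 3))
print("mean of running values:", round(mean_of_running, 3))
```

If something like this is what happens, the early, lower values keep weighing on the displayed number, which would be consistent with the log showing a lower accuracy than the CSV for the first epoch.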


github-actions bot commented Apr 1, 2025

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Apr 1, 2025