Profiler error with tensorboard_trace_handler: UnicodeDecodeError: 'utf-8' codec can't decode byte [...]: invalid start / continuation byte #988

thethomasboyer · 2024-09-13T19:27:18Z

🐛 Describe the bug

The following code:

import torch
from torch.profiler import ProfilerActivity, profile, record_function, tensorboard_trace_handler

DEVICE = "cuda:1"


def main():
    t = torch.rand(10, 10).to(DEVICE)
    for _ in range(100):
        t = t @ t


trace_handler = tensorboard_trace_handler("pytorch_traces", use_gzip=True)
profiler = profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    profile_memory=True,
    with_stack=True,
    on_trace_ready=trace_handler,
)

# profile the main function
profiler.start()
main()
profiler.stop()

fails with:

Traceback (most recent call last):
  File "/import/bc_workspaces/biocomp/tboyer/sources/GaussianProxy/error_repro.py", line 25, in <module>
    profiler.stop()
  File "/import/bc_workspaces/biocomp/tboyer/.micromamba/stat2dyn/lib/python3.12/site-packages/torch/profiler/profiler.py", line 722, in stop
    self._transit_action(self.current_action, None)
  File "/import/bc_workspaces/biocomp/tboyer/.micromamba/stat2dyn/lib/python3.12/site-packages/torch/profiler/profiler.py", line 751, in _transit_action
    action()
  File "/import/bc_workspaces/biocomp/tboyer/.micromamba/stat2dyn/lib/python3.12/site-packages/torch/profiler/profiler.py", line 745, in _trace_ready
    self.on_trace_ready(self)
  File "/import/bc_workspaces/biocomp/tboyer/.micromamba/stat2dyn/lib/python3.12/site-packages/torch/profiler/profiler.py", line 444, in handler_fn
    prof.export_chrome_trace(os.path.join(dir_name, file_name))
  File "/import/bc_workspaces/biocomp/tboyer/.micromamba/stat2dyn/lib/python3.12/site-packages/torch/profiler/profiler.py", line 220, in export_chrome_trace
    fout.writelines(fin)
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9d in position 5237: invalid start byte

The offending byte and its position vary between runs (for example 0xf8 at position 5248), and the message reports either an invalid start byte or an invalid continuation byte.
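Judging from the traceback, the error is raised while the exported trace is read back as text (`fout.writelines(fin)`) before being gzipped, so the uncompressed trace presumably contains bytes that are not valid UTF-8. The sketch below reproduces that failure mode with the standard library only, and shows one way to locate the first offending byte in a trace file. `copy_text_mode` and `find_invalid_utf8` are hypothetical helpers written for illustration (not part of PyTorch), and the payload is made up.

```python
import gzip
import os
import tempfile


def copy_text_mode(src, dst):
    """Imitate the failing pattern: read as UTF-8 text, write to a gzip text stream."""
    with open(src, encoding="utf-8") as fin, gzip.open(dst, "wt", encoding="utf-8") as fout:
        fout.writelines(fin)


def find_invalid_utf8(path):
    """Return (offset, byte) of the first invalid UTF-8 sequence, or None if the file decodes."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "trace.json")
    # Made-up payload: a JSON-like line with a stray 0x9d byte inside a string.
    with open(src, "wb") as f:
        f.write(b'{"name": "frame\x9d"}\n')

    print("first invalid byte:", find_invalid_utf8(src))
    try:
        copy_text_mode(src, os.path.join(workdir, "trace.json.gz"))
    except UnicodeDecodeError as exc:
        print("text-mode copy failed:", exc)
```

Running `find_invalid_utf8` on a trace exported without `use_gzip=True` might help narrow down which event carries the bad bytes.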

Versions

Environment information

PyTorch version: 2.4.0
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.12.5 | packaged by conda-forge | (main, Aug  8 2024, 18:36:51) [GCC 12.4.0] (64-bit runtime)
Python platform: Linux-5.8.0-63-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA L40S
GPU 1: NVIDIA L40S
GPU 2: NVIDIA L40S
GPU 3: NVIDIA L40S

Nvidia driver version: 550.54.14
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.7
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                           x86_64
CPU op-mode(s):                         32-bit, 64-bit
Byte Order:                             Little Endian
Address sizes:                          52 bits physical, 57 bits virtual
CPU(s):                                 64
On-line CPU(s) list:                    0-63
Thread(s) per core:                     1
Core(s) per socket:                     1
Socket(s):                              64
NUMA node(s):                           1
Vendor ID:                              GenuineIntel
CPU family:                             6
Model:                                  143
Model name:                             Intel(R) Xeon(R) Gold 6426Y
Stepping:                               8
CPU MHz:                                2500.000
BogoMIPS:                               5000.00
Virtualization:                         VT-x
Hypervisor vendor:                      KVM
Virtualization type:                    full
L1d cache:                              2 MiB
L1i cache:                              2 MiB
L2 cache:                               256 MiB
L3 cache:                               1 GiB
NUMA node0 CPU(s):                      0-63
Vulnerability Itlb multihit:            Not affected
Vulnerability L1tf:                     Not affected
Vulnerability Mds:                      Not affected
Vulnerability Meltdown:                 Not affected
Vulnerability Spec store bypass:        Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:               Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:               Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds:                    Not affected
Vulnerability Tsx async abort:          Mitigation; TSX disabled
Flags:                                  fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 wbnoinvd arat avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid cldemote movdiri movdir64b fsrm md_clear arch_capabilities

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.4.0
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.19.0
[pip3] triton==3.0.0
[conda] Could not collect

cc @robieta @chaekit @aaronenyeshi @guotuofeng @guyang3532 @dzhulgakov @davidberard98 @briancoutinho @sraikund16 @sanrise

thethomasboyer · 2024-09-16T11:50:24Z

It seems that with_stack=True is the culprit. I can't reproduce this on a fresh Colab, but it is still reproducible on my install.
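For anyone who needs both with_stack=True and a gzipped trace in the meantime, one possible workaround is to let the profiler write an uncompressed trace and compress it in binary mode, so the bytes are never decoded. This is only a sketch under assumptions: `binary_gzip_trace_handler` is a hypothetical helper, not a PyTorch API; the output file name only imitates the `*.pt.trace.json.gz` pattern and may not match exactly what the TensorBoard plugin expects; and the invalid bytes remain in the trace, so a viewer may still reject the file.

```python
import gzip
import os
import shutil
import tempfile
import time


def binary_gzip_trace_handler(dir_name, worker_name="worker"):
    """Hypothetical stand-in for tensorboard_trace_handler(dir_name, use_gzip=True).

    Returns a callable suitable for `on_trace_ready`: it only relies on the
    profiler object exposing `export_chrome_trace(path)`.
    """

    def handler_fn(prof):
        os.makedirs(dir_name, exist_ok=True)
        # Export to a temporary uncompressed file first.
        fd, tmp_path = tempfile.mkstemp(suffix=".json")
        os.close(fd)
        try:
            prof.export_chrome_trace(tmp_path)
            # Assumed naming scheme, chosen to resemble the default handler's output.
            file_name = f"{worker_name}.{time.time_ns()}.pt.trace.json.gz"
            out_path = os.path.join(dir_name, file_name)
            # Copy raw bytes into the gzip stream: no UTF-8 decoding takes place.
            with open(tmp_path, "rb") as fin, gzip.open(out_path, "wb") as fout:
                shutil.copyfileobj(fin, fout)
        finally:
            os.remove(tmp_path)

    return handler_fn
```

It would be passed as `on_trace_ready=binary_gzip_trace_handler("pytorch_traces")` in place of the original handler.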

sraikund16 · 2024-09-16T16:10:57Z

All TensorBoard issues should be in kineto. Transferring.

sraikund16 transferred this issue from pytorch/pytorch Sep 16, 2024

sraikund16 added the "plugin" label (PyTorch Profiler TensorBoard Plugin related) Sep 16, 2024
