huggingface · sergiopaniego · Aug 29, 2024
diff --git a/chapters/de/chapter4/3.mdx b/chapters/de/chapter4/3.mdx
@@ -445,10 +445,10 @@ ls
 
 {#if fw === 'pt'}
 ```bash
-config.json  pytorch_model.bin  README.md  sentencepiece.bpe.model  special_tokens_map.json tokenizer_config.json  tokenizer.json
+added_tokens.json config.json model.safetensors sentencepiece.bpe.model special_tokens_map.json tokenizer_config.json tokenizer.json
 ```
 
-Wenn du dir die Dateigrößen anschaust (z.B. mit `ls -lh`), solltest du sehen, dass die Modell-Statedict Datei (*pytorch_model.bin*) der einzige Ausreißer ist mit über 400 MB.
+Wenn du dir die Dateigrößen anschaust (z.B. mit `ls -lh`), solltest du sehen, dass die Modell-Statedict Datei (*model.safetensors*) der einzige Ausreißer ist mit über 400 MB.
 
 {:else}
 ```bash
@@ -460,7 +460,7 @@ Wenn du dir die Dateigrößen anschaust (z.B. mit `ls -lh`), solltest du sehen,
 {/if}
 
 <Tip>
-✏️  Wenn ein Repository mittels der Webinterface kreiert wird, wird die *.gitattributes* Datei automatisch gesetzt, um bestimmte Dateiendungen wie *.bin* und *.h5* als große Dateien zu betrachten, sodass git-lfs sie tracken kann, ohne dass du weiteres konfigurieren musst.
+✏️  Wenn ein Repository mittels der Webinterface kreiert wird, wird die *.gitattributes* Datei automatisch gesetzt, um bestimmte Dateiendungen wie *.safetensors* und *.h5* als große Dateien zu betrachten, sodass git-lfs sie tracken kann, ohne dass du weiteres konfigurieren musst.
 </Tip> 
 
 Nun können wir weitermachen und so arbeiten wie wir es mit normalen Git Repositories machen. Wir können die Dateien stagen mit dem Git-Befehl `git add`:
@@ -483,12 +483,13 @@ Your branch is up to date with 'origin/main'.
 Changes to be committed:
   (use "git restore --staged <file>..." to unstage)
   modified:   .gitattributes
+	new file:   added_tokens.json
 	new file:   config.json
-	new file:   pytorch_model.bin
+	new file:   model.safetensors
 	new file:   sentencepiece.bpe.model
 	new file:   special_tokens_map.json
-	new file:   tokenizer.json
 	new file:   tokenizer_config.json
+	new file:   tokenizer.json
 ```
 {:else}
 ```bash
@@ -521,12 +522,13 @@ Objects to be pushed to origin/main:
 
 Objects to be committed:
 
-	config.json (Git: bc20ff2)
-	pytorch_model.bin (LFS: 35686c2)
+	added_tokens.json (Git: 43734cd)
+	config.json (Git: acfd093)
+	model.safetensors (LFS: 2785d2e)
 	sentencepiece.bpe.model (LFS: 988bc5a)
-	special_tokens_map.json (Git: cb23931)
-	tokenizer.json (Git: 851ff3e)
-	tokenizer_config.json (Git: f0f7783)
+	special_tokens_map.json (Git: b547935)
+	tokenizer.json (Git: 18d0f7a)
+	tokenizer_config.json (Git: c49982e)
 
 Objects not staged for commit:
 
@@ -567,11 +569,11 @@ git commit -m "First model version"
 
 {#if fw === 'pt'}
 ```bash
-[main b08aab1] First model version
- 7 files changed, 29027 insertions(+)
-  6 files changed, 36 insertions(+)
+[main c2ec5c9] First model version
+ 7 files changed, 128351 insertions(+)
+ create mode 100644 added_tokens.json
  create mode 100644 config.json
- create mode 100644 pytorch_model.bin
+ create mode 100644 model.safetensors
  create mode 100644 sentencepiece.bpe.model
  create mode 100644 special_tokens_map.json
  create mode 100644 tokenizer.json
@@ -597,15 +599,15 @@ git push
 ```
 
 ```bash
-Uploading LFS objects: 100% (1/1), 433 MB | 1.3 MB/s, done.
-Enumerating objects: 11, done.
-Counting objects: 100% (11/11), done.
-Delta compression using up to 12 threads
-Compressing objects: 100% (9/9), done.
-Writing objects: 100% (9/9), 288.27 KiB | 6.27 MiB/s, done.
-Total 9 (delta 1), reused 0 (delta 0), pack-reused 0
+Uploading LFS objects: 100% (2/2), 444 MB | 86 MB/s, done.
+Enumerating objects: 10, done.
+Counting objects: 100% (10/10), done.
+Delta compression using up to 2 threads
+Compressing objects: 100% (8/8), done.
+Writing objects: 100% (9/9), 592.02 KiB | 6.30 MiB/s, done.
+Total 9 (delta 0), reused 0 (delta 0), pack-reused 0
 To https://huggingface.co/lysandre/dummy
-   891b41d..b08aab1  main -> main
+   70fd9db..c2ec5c9  main -> main
 ```
 
 {#if fw === 'pt'}

diff --git a/chapters/en/chapter4/3.mdx b/chapters/en/chapter4/3.mdx
@@ -451,10 +451,10 @@ ls
 
 {#if fw === 'pt'}
 ```bash
-config.json  pytorch_model.bin  README.md  sentencepiece.bpe.model  special_tokens_map.json tokenizer_config.json  tokenizer.json
+added_tokens.json config.json model.safetensors sentencepiece.bpe.model special_tokens_map.json tokenizer_config.json tokenizer.json
 ```
 
-If you look at the file sizes (for example, with `ls -lh`), you should see that the model state dict file (*pytorch_model.bin*) is the only outlier, at more than 400 MB.
+If you look at the file sizes (for example, with `ls -lh`), you should see that the model state dict file (*model.safetensors*) is the only outlier, at more than 400 MB.
 
 {:else}
 ```bash
@@ -466,7 +466,7 @@ If you look at the file sizes (for example, with `ls -lh`), you should see that
 {/if}
 
 <Tip>
-✏️ When creating the repository from the web interface, the *.gitattributes* file is automatically set up to consider files with certain extensions, such as *.bin* and *.h5*, as large files, and git-lfs will track them with no necessary setup on your side.
+✏️ When creating the repository from the web interface, the *.gitattributes* file is automatically set up to consider files with certain extensions, such as *.safetensors* and *.h5*, as large files, and git-lfs will track them with no necessary setup on your side.
 </Tip> 
 
 We can now go ahead and proceed like we would usually do with traditional Git repositories. We can add all the files to Git's staging environment using the `git add` command:
@@ -489,12 +489,13 @@ Your branch is up to date with 'origin/main'.
 Changes to be committed:
   (use "git restore --staged <file>..." to unstage)
   modified:   .gitattributes
+	new file:   added_tokens.json
 	new file:   config.json
-	new file:   pytorch_model.bin
+	new file:   model.safetensors
 	new file:   sentencepiece.bpe.model
 	new file:   special_tokens_map.json
-	new file:   tokenizer.json
 	new file:   tokenizer_config.json
+	new file:   tokenizer.json
 ```
 {:else}
 ```bash
@@ -527,12 +528,13 @@ Objects to be pushed to origin/main:
 
 Objects to be committed:
 
-	config.json (Git: bc20ff2)
-	pytorch_model.bin (LFS: 35686c2)
+	added_tokens.json (Git: 43734cd)
+	config.json (Git: acfd093)
+	model.safetensors (LFS: 2785d2e)
 	sentencepiece.bpe.model (LFS: 988bc5a)
-	special_tokens_map.json (Git: cb23931)
-	tokenizer.json (Git: 851ff3e)
-	tokenizer_config.json (Git: f0f7783)
+	special_tokens_map.json (Git: b547935)
+	tokenizer.json (Git: 18d0f7a)
+	tokenizer_config.json (Git: c49982e)
 
 Objects not staged for commit:
 
@@ -573,11 +575,11 @@ git commit -m "First model version"
 
 {#if fw === 'pt'}
 ```bash
-[main b08aab1] First model version
- 7 files changed, 29027 insertions(+)
-  6 files changed, 36 insertions(+)
+[main c2ec5c9] First model version
+ 7 files changed, 128351 insertions(+)
+ create mode 100644 added_tokens.json
  create mode 100644 config.json
- create mode 100644 pytorch_model.bin
+ create mode 100644 model.safetensors
  create mode 100644 sentencepiece.bpe.model
  create mode 100644 special_tokens_map.json
  create mode 100644 tokenizer.json
@@ -603,15 +605,15 @@ git push
 ```
 
 ```bash
-Uploading LFS objects: 100% (1/1), 433 MB | 1.3 MB/s, done.
-Enumerating objects: 11, done.
-Counting objects: 100% (11/11), done.
-Delta compression using up to 12 threads
-Compressing objects: 100% (9/9), done.
-Writing objects: 100% (9/9), 288.27 KiB | 6.27 MiB/s, done.
-Total 9 (delta 1), reused 0 (delta 0), pack-reused 0
+Uploading LFS objects: 100% (2/2), 444 MB | 86 MB/s, done.
+Enumerating objects: 10, done.
+Counting objects: 100% (10/10), done.
+Delta compression using up to 2 threads
+Compressing objects: 100% (8/8), done.
+Writing objects: 100% (9/9), 592.02 KiB | 6.30 MiB/s, done.
+Total 9 (delta 0), reused 0 (delta 0), pack-reused 0
 To https://huggingface.co/lysandre/dummy
-   891b41d..b08aab1  main -> main
+   70fd9db..c2ec5c9  main -> main
 ```
 
 {#if fw === 'pt'}

diff --git a/chapters/fr/chapter4/3.mdx b/chapters/fr/chapter4/3.mdx
@@ -450,10 +450,10 @@ ls
 
 {#if fw === 'pt'}
 ```bash
-config.json  pytorch_model.bin  README.md  sentencepiece.bpe.model  special_tokens_map.json tokenizer_config.json  tokenizer.json
+added_tokens.json config.json model.safetensors sentencepiece.bpe.model special_tokens_map.json tokenizer_config.json tokenizer.json
 ```
 
-Si vous regardez la taille des fichiers (par exemple, avec `ls -lh`), vous devriez voir que le fichier d'état du modèle (*pytorch_model.bin*) est la seule exception, avec plus de 400 Mo.
+Si vous regardez la taille des fichiers (par exemple, avec `ls -lh`), vous devriez voir que le fichier d'état du modèle (*model.safetensors*) est la seule exception, avec plus de 400 Mo.
 
 {:else}
 ```bash
@@ -488,12 +488,13 @@ Your branch is up to date with 'origin/main'.
 Changes to be committed:
   (use "git restore --staged <file>..." to unstage)
   modified:   .gitattributes
+	new file:   added_tokens.json
 	new file:   config.json
-	new file:   pytorch_model.bin
+	new file:   model.safetensors
 	new file:   sentencepiece.bpe.model
 	new file:   special_tokens_map.json
-	new file:   tokenizer.json
 	new file:   tokenizer_config.json
+	new file:   tokenizer.json
 ```
 {:else}
 ```bash
@@ -526,12 +527,13 @@ Objects to be pushed to origin/main:
 
 Objects to be committed:
 
-	config.json (Git: bc20ff2)
-	pytorch_model.bin (LFS: 35686c2)
+	added_tokens.json (Git: 43734cd)
+	config.json (Git: acfd093)
+	model.safetensors (LFS: 2785d2e)
 	sentencepiece.bpe.model (LFS: 988bc5a)
-	special_tokens_map.json (Git: cb23931)
-	tokenizer.json (Git: 851ff3e)
-	tokenizer_config.json (Git: f0f7783)
+	special_tokens_map.json (Git: b547935)
+	tokenizer.json (Git: 18d0f7a)
+	tokenizer_config.json (Git: c49982e)
 
 Objects not staged for commit:
 
@@ -572,11 +574,11 @@ git commit -m "First model version"
 
 {#if fw === 'pt'}
 ```bash
-[main b08aab1] First model version
- 7 files changed, 29027 insertions(+)
-  6 files changed, 36 insertions(+)
+[main c2ec5c9] First model version
+ 7 files changed, 128351 insertions(+)
+ create mode 100644 added_tokens.json
  create mode 100644 config.json
- create mode 100644 pytorch_model.bin
+ create mode 100644 model.safetensors
  create mode 100644 sentencepiece.bpe.model
  create mode 100644 special_tokens_map.json
  create mode 100644 tokenizer.json
@@ -602,15 +604,15 @@ git push
 ```
 
 ```bash
-Uploading LFS objects: 100% (1/1), 433 MB | 1.3 MB/s, done.
-Enumerating objects: 11, done.
-Counting objects: 100% (11/11), done.
-Delta compression using up to 12 threads
-Compressing objects: 100% (9/9), done.
-Writing objects: 100% (9/9), 288.27 KiB | 6.27 MiB/s, done.
-Total 9 (delta 1), reused 0 (delta 0), pack-reused 0
+Uploading LFS objects: 100% (2/2), 444 MB | 86 MB/s, done.
+Enumerating objects: 10, done.
+Counting objects: 100% (10/10), done.
+Delta compression using up to 2 threads
+Compressing objects: 100% (8/8), done.
+Writing objects: 100% (9/9), 592.02 KiB | 6.30 MiB/s, done.
+Total 9 (delta 0), reused 0 (delta 0), pack-reused 0
 To https://huggingface.co/lysandre/dummy
-   891b41d..b08aab1  main -> main
+   70fd9db..c2ec5c9  main -> main
 ```
 
 {#if fw === 'pt'}

diff --git a/chapters/it/chapter4/3.mdx b/chapters/it/chapter4/3.mdx
@@ -452,10 +452,10 @@ ls
 
 {#if fw === 'pt'}
 ```bash
-config.json  pytorch_model.bin  README.md  sentencepiece.bpe.model  special_tokens_map.json tokenizer_config.json  tokenizer.json
+added_tokens.json config.json model.safetensors sentencepiece.bpe.model special_tokens_map.json tokenizer_config.json tokenizer.json
 ```
 
-Guardando le dimensioni dei file (ad esempio con `ls -lh`), possiamo vedere che il file contenente lo stato del modello (model state dict file) (*pytorch_model.bin*) è l'unico file anomalo, occupando più di 400 MB.
+Guardando le dimensioni dei file (ad esempio con `ls -lh`), possiamo vedere che il file contenente lo stato del modello (model state dict file) (*model.safetensors*) è l'unico file anomalo, occupando più di 400 MB.
 
 {:else}
 ```bash
@@ -467,8 +467,8 @@ Guardando le dimensioni dei file (ad esempio con `ls -lh`), possiamo vedere che
 {/if}
 
 <Tip>
-✏️ When creating the repository from the web interface, the *.gitattributes* file is automatically set up to consider files with certain extensions, such as *.bin* and *.h5*, as large files, and git-lfs will track them with no necessary setup on your side.
-✏️ Creando il reposiotry dall'interfaccia web, il file *.gitattributes*  viene automaticamente configurato per considerare file con alcune estensioni, come *.bin* e *.h5*, come file grandi, e git-lfs li traccerà senza necessità di configurazione da parte dell'utente.
+✏️ When creating the repository from the web interface, the *.gitattributes* file is automatically set up to consider files with certain extensions, such as *.safetensors* and *.h5*, as large files, and git-lfs will track them with no necessary setup on your side.
+✏️ Creando il repository dall'interfaccia web, il file *.gitattributes*  viene automaticamente configurato per considerare file con alcune estensioni, come *.safetensors* e *.h5*, come file grandi, e git-lfs li traccerà senza necessità di configurazione da parte dell'utente.
 </Tip> 
 
 Possiamo quindi procedere come faremo per un repository Git tradizionale. Possiamo aggiungere tutti i file all'ambiente di staging di Git con il comando `git add`:
@@ -491,12 +491,13 @@ Your branch is up to date with 'origin/main'.
 Changes to be committed:
   (use "git restore --staged <file>..." to unstage)
   modified:   .gitattributes
+	new file:   added_tokens.json
 	new file:   config.json
-	new file:   pytorch_model.bin
+	new file:   model.safetensors
 	new file:   sentencepiece.bpe.model
 	new file:   special_tokens_map.json
-	new file:   tokenizer.json
 	new file:   tokenizer_config.json
+	new file:   tokenizer.json
 ```
 {:else}
 ```bash
@@ -529,12 +530,13 @@ Objects to be pushed to origin/main:
 
 Objects to be committed:
 
-	config.json (Git: bc20ff2)
-	pytorch_model.bin (LFS: 35686c2)
+	added_tokens.json (Git: 43734cd)
+	config.json (Git: acfd093)
+	model.safetensors (LFS: 2785d2e)
 	sentencepiece.bpe.model (LFS: 988bc5a)
-	special_tokens_map.json (Git: cb23931)
-	tokenizer.json (Git: 851ff3e)
-	tokenizer_config.json (Git: f0f7783)
+	special_tokens_map.json (Git: b547935)
+	tokenizer.json (Git: 18d0f7a)
+	tokenizer_config.json (Git: c49982e)
 
 Objects not staged for commit:
 
@@ -575,11 +577,11 @@ git commit -m "First model version"
 
 {#if fw === 'pt'}
 ```bash
-[main b08aab1] First model version
- 7 files changed, 29027 insertions(+)
-  6 files changed, 36 insertions(+)
+[main c2ec5c9] First model version
+ 7 files changed, 128351 insertions(+)
+ create mode 100644 added_tokens.json
  create mode 100644 config.json
- create mode 100644 pytorch_model.bin
+ create mode 100644 model.safetensors
  create mode 100644 sentencepiece.bpe.model
  create mode 100644 special_tokens_map.json
  create mode 100644 tokenizer.json
@@ -605,15 +607,15 @@ git push
 ```
 
 ```bash
-Uploading LFS objects: 100% (1/1), 433 MB | 1.3 MB/s, done.
-Enumerating objects: 11, done.
-Counting objects: 100% (11/11), done.
-Delta compression using up to 12 threads
-Compressing objects: 100% (9/9), done.
-Writing objects: 100% (9/9), 288.27 KiB | 6.27 MiB/s, done.
-Total 9 (delta 1), reused 0 (delta 0), pack-reused 0
+Uploading LFS objects: 100% (2/2), 444 MB | 86 MB/s, done.
+Enumerating objects: 10, done.
+Counting objects: 100% (10/10), done.
+Delta compression using up to 2 threads
+Compressing objects: 100% (8/8), done.
+Writing objects: 100% (9/9), 592.02 KiB | 6.30 MiB/s, done.
+Total 9 (delta 0), reused 0 (delta 0), pack-reused 0
 To https://huggingface.co/lysandre/dummy
-   891b41d..b08aab1  main -> main
+   70fd9db..c2ec5c9  main -> main
 ```
 
 {#if fw === 'pt'}
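
Review note: the updated Tip says that *.gitattributes* already routes *.safetensors* files through git-lfs. A reader who wants to confirm this locally could use something like the sketch below. It is only an illustration: `SAMPLE_GITATTRIBUTES` is a hypothetical excerpt (the file generated by the Hub may list different patterns), `lfs_patterns` and `is_lfs_tracked` are names made up for this note, and `fnmatchcase` is a simplification of Git's real pattern-matching rules.

```python
from fnmatch import fnmatchcase

# Hypothetical excerpt of a .gitattributes file. The real file created by the
# web interface may contain more (or different) patterns.
SAMPLE_GITATTRIBUTES = """\
*.bin filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(gitattributes_text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        # Skip blank lines and comments.
        if not parts or parts[0].startswith("#"):
            continue
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename, patterns):
    """Approximate check: does the file name match any LFS pattern?

    Git's own matching rules are richer than fnmatch; this is good enough
    for simple extension patterns such as *.safetensors.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = lfs_patterns(SAMPLE_GITATTRIBUTES)
    for name in ("model.safetensors", "config.json"):
        print(name, "->", "LFS" if is_lfs_tracked(name, patterns) else "Git")
```

In a real clone, the same functions could be fed the contents of the repository's own *.gitattributes* file instead of the sample string.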