Commit

Deployed f9bccc7 to 0.10 with MkDocs 1.5.3 and mike 1.1.2
ci-bot committed Jun 25, 2024
1 parent 8c75da7 commit 8be21a7
Showing 6 changed files with 34 additions and 14 deletions.
22 changes: 16 additions & 6 deletions 0.10/configuration/preprocessing/index.html
@@ -1528,8 +1528,8 @@
</li>

<li class="md-nav__item">
<a href="#sample-ratio" class="md-nav__link">
Sample Ratio
<a href="#sample-ratio-and-size" class="md-nav__link">
Sample Ratio and Size
</a>

</li>
@@ -3584,8 +3584,8 @@
</li>

<li class="md-nav__item">
<a href="#sample-ratio" class="md-nav__link">
Sample Ratio
<a href="#sample-ratio-and-size" class="md-nav__link">
Sample Ratio and Size
</a>

</li>
@@ -3768,17 +3768,27 @@ <h3 id="undersampling">Undersampling<a class="headerlink" href="#undersampling"
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">undersample_majority</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0.7</span>
</code></pre></div>
<h2 id="sample-ratio">Sample Ratio<a class="headerlink" href="#sample-ratio" title="Permanent link">&para;</a></h2>
<h2 id="sample-ratio-and-size">Sample Ratio and Size<a class="headerlink" href="#sample-ratio-and-size" title="Permanent link">&para;</a></h2>
<p>Sometimes users may want to train on a sample of their input training data (maybe
there's too much, and we only need 20%, or we want to try out ideas on a smaller
subset of our data). In order to achieve this, a user can specify a <code>sample_ratio</code>
subset of our data). In order to achieve this, a user can specify a <code>sample_ratio</code> or a <code>sample_size</code>
to indicate how much of the dataset to use for training.</p>
<p>By default, the sample ratio is 1.0, so if not specified, all the data will be
used for training. For example, if you only want to use 30% of your input data,
you could specify a config like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">sample_ratio</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0.3</span>
</code></pre></div>
<p>Furthermore, if you want to specify the exact number of samples to use for training,
you can use the <code>sample_size</code> parameter. For example, if you want to use 1000 samples for training,
you could specify a config like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">sample_size</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1000</span>
</code></pre></div>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p><code>sample_size</code> can only be used when <code>sample_ratio</code> is 1.0, which is the default value.</p>
</div>
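<p>As a rough illustration of how the two options relate, the sketch below resolves them to a row count. It is not the library's actual implementation: the function name <code>resolve_sample_rows</code>, the rounding down, and the capping of <code>sample_size</code> at the dataset size are all assumptions made for this example. Only the rule that <code>sample_size</code> requires <code>sample_ratio</code> to be 1.0 comes from the warning above.</p>

```python
from typing import Optional


def resolve_sample_rows(
    total_rows: int,
    sample_ratio: float = 1.0,
    sample_size: Optional[int] = None,
) -> int:
    """Hypothetical helper: how many rows would be kept for training.

    Assumptions (not confirmed by the documentation):
    - a ratio is converted to a row count by rounding down;
    - a sample_size larger than the dataset is capped at the dataset size.
    """
    if not 0.0 < sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be in the interval (0, 1]")

    if sample_size is not None:
        # Documented rule: sample_size is only valid with the default ratio.
        if sample_ratio != 1.0:
            raise ValueError("sample_size requires sample_ratio to be 1.0")
        if sample_size <= 0:
            raise ValueError("sample_size must be a positive integer")
        return min(sample_size, total_rows)  # assumed capping behaviour

    return int(total_rows * sample_ratio)  # assumed rounding down


if __name__ == "__main__":
    print(resolve_sample_rows(10_000, sample_ratio=0.5))
    print(resolve_sample_rows(10_000, sample_size=1000))
```

<p>Under these assumptions, setting both options to non-default values raises an error instead of silently preferring one of them.</p>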
<h2 id="global-max-sequence-length">Global Max Sequence Length<a class="headerlink" href="#global-max-sequence-length" title="Permanent link">&para;</a></h2>
<p>There are <a href="https://www.youtube.com/watch?v=g68qlo9Izf0&amp;t=2685s">many factors at play</a>
when it comes to fine-tuning LLMs efficiently on a single GPU.</p>
2 changes: 1 addition & 1 deletion 0.10/search/search_index.json

Large diffs are not rendered by default.

Binary file modified 0.10/sitemap.xml.gz
Binary file not shown.
22 changes: 16 additions & 6 deletions latest/configuration/preprocessing/index.html
@@ -1528,8 +1528,8 @@
</li>

<li class="md-nav__item">
<a href="#sample-ratio" class="md-nav__link">
Sample Ratio
<a href="#sample-ratio-and-size" class="md-nav__link">
Sample Ratio and Size
</a>

</li>
@@ -3584,8 +3584,8 @@
</li>

<li class="md-nav__item">
<a href="#sample-ratio" class="md-nav__link">
Sample Ratio
<a href="#sample-ratio-and-size" class="md-nav__link">
Sample Ratio and Size
</a>

</li>
@@ -3768,17 +3768,27 @@ <h3 id="undersampling">Undersampling<a class="headerlink" href="#undersampling"
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">undersample_majority</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0.7</span>
</code></pre></div>
<h2 id="sample-ratio">Sample Ratio<a class="headerlink" href="#sample-ratio" title="Permanent link">&para;</a></h2>
<h2 id="sample-ratio-and-size">Sample Ratio and Size<a class="headerlink" href="#sample-ratio-and-size" title="Permanent link">&para;</a></h2>
<p>Sometimes users may want to train on a sample of their input training data (maybe
there's too much, and we only need 20%, or we want to try out ideas on a smaller
subset of our data). In order to achieve this, a user can specify a <code>sample_ratio</code>
subset of our data). In order to achieve this, a user can specify a <code>sample_ratio</code> or a <code>sample_size</code>
to indicate how much of the dataset to use for training.</p>
<p>By default, the sample ratio is 1.0, so if not specified, all the data will be
used for training. For example, if you only want to use 30% of your input data,
you could specify a config like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">sample_ratio</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">0.3</span>
</code></pre></div>
<p>Furthermore, if you want to specify the exact number of samples to use for training,
you can use the <code>sample_size</code> parameter. For example, if you want to use 1000 samples for training,
you could specify a config like this:</p>
<div class="highlight"><pre><span></span><code><span class="nt">preprocessing</span><span class="p">:</span>
<span class="w"> </span><span class="nt">sample_size</span><span class="p">:</span><span class="w"> </span><span class="l l-Scalar l-Scalar-Plain">1000</span>
</code></pre></div>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p><code>sample_size</code> can only be used when <code>sample_ratio</code> is 1.0, which is the default value.</p>
</div>
<h2 id="global-max-sequence-length">Global Max Sequence Length<a class="headerlink" href="#global-max-sequence-length" title="Permanent link">&para;</a></h2>
<p>There are <a href="https://www.youtube.com/watch?v=g68qlo9Izf0&amp;t=2685s">many factors at play</a>
when it comes to fine-tuning LLMs efficiently on a single GPU.</p>
2 changes: 1 addition & 1 deletion latest/search/search_index.json

Large diffs are not rendered by default.

Binary file modified latest/sitemap.xml.gz
Binary file not shown.
