📝 Fixed typos & grammatical mistakes Chapter12/1.mdx #816

Open
wants to merge 1 commit into
base: main
6 changes: 3 additions & 3 deletions chapters/en/chapter12/1.mdx
@@ -8,13 +8,13 @@ We will also explore [Open R1](https://github.com/huggingface/open-r1), a ground

In this chapter, we'll break down complex concepts into easy-to-understand pieces and show you how you can be part of this exciting project to make LLMs reason on complex problems.

-LLMs have shown excellent performance on many generative tasks. However, up until recently they have struggled on complex problems that require reasoning. For example, they struggle to deal with puzzles or math problems that require multiple steps of reasoning.
+LLMs have shown excellent performance on many generative tasks. However, up until recently they have struggled with complex problems that require reasoning. For example, they struggle to deal with puzzles or math problems that require multiple steps of reasoning.

Open R1 is a project that aims to make LLMs reason on complex problems. It does this by using reinforcement learning to encourage LLMs to 'think' and reason.

-In simple terms, the model is train to generate thoughts as well as outputs, and to structure these thoughts and outputs so that they can be handled separately by the user.
+In simple terms, the model is trained to generate thoughts as well as outputs, and to structure these thoughts and outputs so that they can be handled separately by the user.

-Let's take a look at an example. A we gave ourself the task of solving the following problem, we might think like this:
+Let's take a look at an example. If we gave ourselves the task of solving the following problem, we might think like this:

```sh
Problem: "I have 3 apples and 2 oranges. How many pieces of fruit do I have in total?"