python/llm/example/GPU/HuggingFace/Multimodal/MiniCPM-Llama3-V-2_5/README.md
+23-9
@@ -5,7 +5,7 @@ In this directory, you will find examples on how you could apply IPEX-LLM INT4 o
To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.
## Example: Predict Tokens using `chat()` API
-In the example [generate.py](./generate.py), we show a basic use case for a MiniCPM-Llama3-V-2_5 model to predict the next N tokens using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
+In the example [chat.py](./chat.py), we show a basic use case for a MiniCPM-Llama3-V-2_5 model to predict the next N tokens using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
@@ -106,28 +106,42 @@ set SYCL_CACHE_PERSISTENT=1
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
-```
-python ./generate.py --prompt 'What is in the image?'
-```
+- chat without streaming mode:
+```
+python ./chat.py --prompt 'What is in the image?'
+```
+- chat in streaming mode:
+```
+python ./chat.py --prompt 'What is in the image?' --stream
+```
Arguments info:
- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the Hugging Face repo id of the MiniCPM-Llama3-V-2_5 model (e.g. `openbmb/MiniCPM-Llama3-V-2_5`) to be downloaded, or the path to the Hugging Face checkpoint folder. It defaults to `'openbmb/MiniCPM-Llama3-V-2_5'`.
- `--image-url-or-path IMAGE_URL_OR_PATH`: argument defining the image to run inference on. It defaults to `'http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg'`.
- `--prompt PROMPT`: argument defining the prompt (wrapped in the model's integrated chat prompt format) to run inference with. It defaults to `'What is in the image?'`.
-- `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
The image features a young child holding a white teddy bear. The teddy bear is dressed in a pink outfit. The child appears to be outdoors, with a stone wall and some red flowers in the background.
The image features a young child holding a white teddy bear. The teddy bear is dressed in a pink dress with a ribbon on it. The child appears to be smiling and enjoying the moment.
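For context, the switch from `generate.py` to `chat.py` centers on the model's `chat()` method, and the two response lines above are presumably the README's recorded sample outputs for the default image. Below is a minimal illustrative sketch of the new flow, assuming the usual IPEX-LLM INT4 loading pattern and the upstream MiniCPM-V `chat()` signature; the local image path is a placeholder, and the shipped `chat.py` may differ in detail.

```
# Hedged sketch, not the shipped chat.py: INT4 load + chat() with/without streaming.
from PIL import Image
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModel

model_path = 'openbmb/MiniCPM-Llama3-V-2_5'
# load_in_4bit=True applies IPEX-LLM INT4 weight quantization at load time
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  trust_remote_code=True)
model = model.half().to('xpu')  # move the optimized model to the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

image = Image.open('test.jpg').convert('RGB')  # hypothetical local image path
msgs = [{'role': 'user', 'content': 'What is in the image?'}]

# Without --stream: chat() returns the whole response as one string.
response = model.chat(image=image, msgs=msgs, tokenizer=tokenizer)
print(response)

# With --stream: stream=True yields text chunks as they are generated
# (the upstream interface typically expects sampling=True when streaming).
for chunk in model.chat(image=image, msgs=msgs, tokenizer=tokenizer,
                        sampling=True, stream=True):
    print(chunk, end='', flush=True)
```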
python/llm/example/GPU/HuggingFace/Multimodal/MiniCPM-V-2/README.md
+22-9
@@ -5,7 +5,7 @@ In this directory, you will find examples on how you could apply IPEX-LLM INT4 o
To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.
## Example: Predict Tokens using `chat()` API
-In the example [generate.py](./generate.py), we show a basic use case for a MiniCPM-V-2 model to predict the next N tokens using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
+In the example [chat.py](./chat.py), we show a basic use case for a MiniCPM-V-2 model to predict the next N tokens using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
@@ -106,28 +106,41 @@ set SYCL_CACHE_PERSISTENT=1
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
-```
-python ./generate.py --prompt 'What is in the image?'
-```
+- chat without streaming mode:
+```
+python ./chat.py --prompt 'What is in the image?'
+```
+- chat in streaming mode:
+```
+python ./chat.py --prompt 'What is in the image?' --stream
+```
Arguments info:
- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the Hugging Face repo id of the MiniCPM-V-2 model (e.g. `openbmb/MiniCPM-V-2`) to be downloaded, or the path to the Hugging Face checkpoint folder. It defaults to `'openbmb/MiniCPM-V-2'`.
- `--image-url-or-path IMAGE_URL_OR_PATH`: argument defining the image to run inference on. It defaults to `'http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg'`.
- `--prompt PROMPT`: argument defining the prompt (wrapped in the model's integrated chat prompt format) to run inference with. It defaults to `'What is in the image?'`.
-- `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.
In the image, there is a young child holding a teddy bear. The teddy bear appears to be dressed in a pink tutu. The child is also wearing a red and white striped dress. The background of the image includes a stone wall and some red flowers.
In the image, there is a young child holding a teddy bear. The teddy bear is dressed in a pink tutu. The child is also wearing a red and white striped dress. The background of the image features a stone wall and some red flowers.
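Both READMEs document the same small CLI surface: `--n-predict` goes away and `--stream` arrives. As a rough sketch only (not copied from the shipped `chat.py`), the documented flags could be wired up with argparse as follows:

```
# Illustrative argparse wiring for the documented flags; details may differ.
import argparse

parser = argparse.ArgumentParser(
    description='Chat with a MiniCPM-V model using IPEX-LLM INT4 optimizations')
parser.add_argument('--repo-id-or-model-path', type=str,
                    default='openbmb/MiniCPM-V-2',
                    help='Hugging Face repo id or local checkpoint folder')
parser.add_argument('--image-url-or-path', type=str,
                    default='http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg',
                    help='URL or path of the image to run inference on')
parser.add_argument('--prompt', type=str, default='What is in the image?',
                    help='prompt to chat about the image')
parser.add_argument('--stream', action='store_true',
                    help='stream the response chunk by chunk instead of all at once')
args = parser.parse_args()
print(args.repo_id_or_model_path, args.prompt, args.stream)
```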