[BUG] Baichuan13B model init failed #323

Closed
bingo787 opened this issue Jan 29, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@bingo787
Contributor

Error log:

2024-01-29T03:38:26.506905Z INFO text_generation_launcher: Args { model_id: "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat", revision: None, sharded: None, num_shard: Some(2), quantize: None, mode: None, trust_remote_code: true, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, max_batch_size: None, waiting_served_ratio: 1.2, max_batch_total_tokens: 10000, max_waiting_tokens: 20, port: 3000, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, env: false, tokenizer_name: "vicuna-baichuan-13b-chat" }
2024-01-29T03:38:26.506933Z INFO text_generation_launcher: Sharding model on 2 processes
2024-01-29T03:38:26.507074Z INFO text_generation_launcher: Starting download process.
2024-01-29T03:38:29.059833Z INFO download: text_generation_launcher: Files are already present on the host. Skipping download.

2024-01-29T03:38:29.636361Z INFO text_generation_launcher: Successfully downloaded weights.
2024-01-29T03:38:29.636407Z WARN text_generation_launcher: trust_remote_code is set. Trusting that model /home/mnt/sdc-share/models/vicuna-baichuan-13b-chat do not contain malicious code.
2024-01-29T03:38:29.636416Z WARN text_generation_launcher: Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
2024-01-29T03:38:29.636599Z INFO text_generation_launcher: Starting shard 0
2024-01-29T03:38:29.636651Z INFO text_generation_launcher: Starting shard 1
2024-01-29T03:38:35.117327Z ERROR shard-manager: text_generation_launcher: {'tp_rank': 1, 'world_size': 2, 'weight_dir': '/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat', 'max_total_token_num': 10000, 'weight_dict': None, 'finetune_config': BaichuanConfig {
"_from_model_config": true,
"_name_or_path": "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat",
"architectures": [
"BaichuanForCausalLM"
],
"auto_map": {
"AutoConfig": "configuration_baichuan.BaichuanConfig",
"AutoModelForCausalLM": "modeling_baichuan.BaichuanForCausalLM"
},
"bos_token_id": 1,
"eos_token_id": 2,
"gradient_checkpointing": [
false
],
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13696,
"model_max_length": 4096,
"model_type": "baichuan",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.33.1",
"use_cache": true,
"vocab_size": 64000
}
, 'max_req_num': 1000, 'mode': [''], 'max_seq_length': 2048}
rank=1
2024-01-29T03:38:35.117515Z ERROR shard-manager: text_generation_launcher: {'tp_rank': 0, 'world_size': 2, 'weight_dir': '/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat', 'max_total_token_num': 10000, 'weight_dict': None, 'finetune_config': BaichuanConfig {
"_from_model_config": true,
"_name_or_path": "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat",
"architectures": [
"BaichuanForCausalLM"
],
"auto_map": {
"AutoConfig": "configuration_baichuan.BaichuanConfig",
"AutoModelForCausalLM": "modeling_baichuan.BaichuanForCausalLM"
},
"bos_token_id": 1,
"eos_token_id": 2,
"gradient_checkpointing": [
false
],
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13696,
"model_max_length": 4096,
"model_type": "baichuan",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.33.1",
"use_cache": true,
"vocab_size": 64000
}
, 'max_req_num': 1000, 'mode': [''], 'max_seq_length': 2048}
rank=0
2024-01-29T03:38:35.122228Z ERROR shard-manager: text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 70, in serve
    server.serve(model_id, revision, sharded, quantize, trust_remote_code, mode.split(","), uds_path)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 172, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize, trust_remote_code))
  File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 634, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 139, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code, mode)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 323, in get_model
    return LightLLM(
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/light_llm.py", line 404, in __init__
    self.model = Baichuan13bTpPartModel(model_kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/baichuan13b/model.py", line 21, in __init__
    super().__init__(kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/llama/model.py", line 35, in __init__
    super().__init__(kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/common/basemodel/basemodel.py", line 49, in __init__
    self._verify_params()
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/baichuan13b/model.py", line 26, in _verify_params
    assert self.mode == [], "baichuan13b only support normal mode"
AssertionError: baichuan13b only support normal mode

@bingo787 bingo787 added the bug Something isn't working label Jan 29, 2024
@bingo787 bingo787 changed the title [BUG] Baichuan model init failed [BUG] Baichuan13B model init failed Jan 29, 2024
@bingo787
Contributor Author

The root cause is this line of code:
https://github.com/ModelTC/lightllm/blob/main/lightllm/models/baichuan13b/model.py#L25

What the upper TGI layer passes down as an "empty" mode may actually be [''].
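A minimal repro of the mismatch (illustrative only, outside of TGI):

```python
# Splitting an empty mode string yields [''], not [], so the
# baichuan13b check fails even though no mode was actually requested.
mode = "".split(",")   # -> ['']
print(mode == [])      # False
assert mode == [], "baichuan13b only support normal mode"  # AssertionError
```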

@hiworldwzj
Collaborator

@bingo787 You could adapt to this by changing the parameters that TGI passes in.
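For example, a hypothetical adaptation of the call shown in the traceback above (TGI's cli.py) could drop empty entries before handing mode to the server; this is a sketch, not the merged change:

```python
# Hypothetical TGI-side fix: filter out empty strings so an unset mode
# reaches lightllm as [] instead of [''].
mode_list = [m for m in mode.split(",") if m]
server.serve(model_id, revision, sharded, quantize, trust_remote_code, mode_list, uds_path)
```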

@hiworldwzj
Collaborator

@bingo787 Regarding #277 and the weight-loading PR you opened: the logic does not seem to be effectively compatible with the existing behavior. Could you change the implementation so that weights are loaded from weight_dict when it is provided, without affecting the main processing flow? A sketch of what I mean follows.
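Roughly like this (the method and helper names are illustrative, not the actual lightllm code): use weight_dict when it is provided, otherwise keep the existing disk-loading path, so verify_load and the rest of the flow stay where they are.

```python
# Sketch only; _load_from_dict / _load_from_disk are hypothetical helpers.
def load_hf_weights(self, weight_dir=None, weight_dict=None):
    if weight_dict is not None:
        self._load_from_dict(weight_dict)  # new path: weights passed in memory
    else:
        self._load_from_disk(weight_dir)   # existing path, unchanged
    self.verify_load()                     # still called at the same point
```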

@bingo787
Contributor Author

> @bingo787 Regarding #277 and the weight-loading PR you opened: the logic does not seem to be effectively compatible with the existing behavior. Could you change the implementation so that weights are loaded from weight_dict when it is provided, without affecting the main processing flow?

It doesn't affect the main flow.

@hiworldwzj
Collaborator

@bingo787 The call order of verify_load was changed. Some models validate and fill in parameters inside verify_load.
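For illustration (a made-up example, not a specific lightllm model), a verify_load can both check the loaded weights and fill in derived parameters, so reordering or skipping it can leave the model half-configured:

```python
# Made-up example of why the verify_load call order matters.
def verify_load(self):
    # validation: make sure every expected weight was actually loaded
    assert all(w is not None for w in self.weights), "some weights were not loaded"
    # parameter completion: derive values that later code relies on
    if getattr(self, "head_dim", None) is None:
        self.head_dim = self.hidden_size // self.num_attention_heads
```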

@bingo787
Contributor Author

Let's continue the discussion under #277.

@bingo787
Contributor Author

Resolved by adapting on the TGI side.
