[BUG] Baichuan13B model init failed #323

Closed
bingo787 opened this issue Jan 29, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@bingo787
Contributor

Error log:

2024-01-29T03:38:26.506905Z INFO text_generation_launcher: Args { model_id: "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat", revision: None, sharded: None, num_shard: Some(2), quantize: None, mode: None, trust_remote_code: true, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, max_batch_size: None, waiting_served_ratio: 1.2, max_batch_total_tokens: 10000, max_waiting_tokens: 20, port: 3000, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, env: false, tokenizer_name: "vicuna-baichuan-13b-chat" }
2024-01-29T03:38:26.506933Z INFO text_generation_launcher: Sharding model on 2 processes
2024-01-29T03:38:26.507074Z INFO text_generation_launcher: Starting download process.
2024-01-29T03:38:29.059833Z INFO download: text_generation_launcher: Files are already present on the host. Skipping download.

2024-01-29T03:38:29.636361Z INFO text_generation_launcher: Successfully downloaded weights.
2024-01-29T03:38:29.636407Z WARN text_generation_launcher: trust_remote_code is set. Trusting that model /home/mnt/sdc-share/models/vicuna-baichuan-13b-chat do not contain malicious code.
2024-01-29T03:38:29.636416Z WARN text_generation_launcher: Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
2024-01-29T03:38:29.636599Z INFO text_generation_launcher: Starting shard 0
2024-01-29T03:38:29.636651Z INFO text_generation_launcher: Starting shard 1
2024-01-29T03:38:35.117327Z ERROR shard-manager: text_generation_launcher: {'tp_rank': 1, 'world_size': 2, 'weight_dir': '/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat', 'max_total_token_num': 10000, 'weight_dict': None, 'finetune_config': BaichuanConfig {
"_from_model_config": true,
"_name_or_path": "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat",
"architectures": [
"BaichuanForCausalLM"
],
"auto_map": {
"AutoConfig": "configuration_baichuan.BaichuanConfig",
"AutoModelForCausalLM": "modeling_baichuan.BaichuanForCausalLM"
},
"bos_token_id": 1,
"eos_token_id": 2,
"gradient_checkpointing": [
false
],
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13696,
"model_max_length": 4096,
"model_type": "baichuan",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.33.1",
"use_cache": true,
"vocab_size": 64000
}
, 'max_req_num': 1000, 'mode': [''], 'max_seq_length': 2048}
rank=1
2024-01-29T03:38:35.117515Z ERROR shard-manager: text_generation_launcher: {'tp_rank': 0, 'world_size': 2, 'weight_dir': '/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat', 'max_total_token_num': 10000, 'weight_dict': None, 'finetune_config': BaichuanConfig {
"_from_model_config": true,
"_name_or_path": "/home/mnt/sdc-share/models/vicuna-baichuan-13b-chat",
"architectures": [
"BaichuanForCausalLM"
],
"auto_map": {
"AutoConfig": "configuration_baichuan.BaichuanConfig",
"AutoModelForCausalLM": "modeling_baichuan.BaichuanForCausalLM"
},
"bos_token_id": 1,
"eos_token_id": 2,
"gradient_checkpointing": [
false
],
"hidden_act": "silu",
"hidden_size": 5120,
"initializer_range": 0.02,
"intermediate_size": 13696,
"model_max_length": 4096,
"model_type": "baichuan",
"num_attention_heads": 40,
"num_hidden_layers": 40,
"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.33.1",
"use_cache": true,
"vocab_size": 64000
}
, 'max_req_num': 1000, 'mode': [''], 'max_seq_length': 2048}
rank=0
2024-01-29T03:38:35.122228Z ERROR shard-manager: text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/opt/conda/lib/python3.9/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 70, in serve
    server.serve(model_id, revision, sharded, quantize, trust_remote_code, mode.split(","), uds_path)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 172, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize, trust_remote_code))
  File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 634, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 139, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code, mode)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 323, in get_model
    return LightLLM(
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/light_llm.py", line 404, in __init__
    self.model = Baichuan13bTpPartModel(model_kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/baichuan13b/model.py", line 21, in __init__
    super().__init__(kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/llama/model.py", line 35, in __init__
    super().__init__(kvargs)
  File "/opt/conda/lib/python3.9/site-packages/lightllm/common/basemodel/basemodel.py", line 49, in __init__
    self._verify_params()
  File "/opt/conda/lib/python3.9/site-packages/lightllm/models/baichuan13b/model.py", line 26, in _verify_params
    assert self.mode == [], "baichuan13b only support normal mode"
AssertionError: baichuan13b only support normal mode

@bingo787 bingo787 added the bug Something isn't working label Jan 29, 2024
@bingo787 bingo787 changed the title [BUG] Baichuan model init failed [BUG] Baichuan13B model init failed Jan 29, 2024
@bingo787
Contributor Author

The root cause is this line of code:
https://github.com/ModelTC/lightllm/blob/main/lightllm/models/baichuan13b/model.py#L25

What the upper TGI layer passes down as an "empty" mode may actually be [''].
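A minimal repro of the mismatch (illustrative only, outside of TGI):

```python
# Splitting an empty mode string yields [''], not [], so the
# baichuan13b check fails even though no mode was actually requested.
mode = "".split(",")   # -> ['']
print(mode == [])      # False
assert mode == [], "baichuan13b only support normal mode"  # AssertionError
```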

@hiworldwzj
Collaborator

@bingo787 You could adapt to this by changing the parameters that TGI passes in.
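For example, a hypothetical adaptation of the call shown in the traceback above (TGI's cli.py) could drop empty entries before handing mode to the server; this is a sketch, not the merged change:

```python
# Hypothetical TGI-side fix: filter out empty strings so an unset mode
# reaches lightllm as [] instead of [''].
mode_list = [m for m in mode.split(",") if m]
server.serve(model_id, revision, sharded, quantize, trust_remote_code, mode_list, uds_path)
```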

@hiworldwzj
Collaborator

@bingo787 Regarding #277 and the weight-loading PR you opened: the logic does not seem to be effectively compatible with the existing behavior. Could you change the implementation so that weights are loaded from weight_dict when it is provided, without affecting the main processing flow? A sketch of what I mean follows.
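Roughly like this (the method and helper names are illustrative, not the actual lightllm code): use weight_dict when it is provided, otherwise keep the existing disk-loading path, so verify_load and the rest of the flow stay where they are.

```python
# Sketch only; _load_from_dict / _load_from_disk are hypothetical helpers.
def load_hf_weights(self, weight_dir=None, weight_dict=None):
    if weight_dict is not None:
        self._load_from_dict(weight_dict)  # new path: weights passed in memory
    else:
        self._load_from_disk(weight_dir)   # existing path, unchanged
    self.verify_load()                     # still called at the same point
```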

@bingo787
Contributor Author

> @bingo787 Regarding #277 and the weight-loading PR you opened: the logic does not seem to be effectively compatible with the existing behavior. Could you change the implementation so that weights are loaded from weight_dict when it is provided, without affecting the main processing flow?

It doesn't affect the main flow.

@hiworldwzj
Collaborator

@bingo787 The call order of verify_load was changed. Some models validate and fill in parameters inside verify_load.
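For illustration (a made-up example, not a specific lightllm model), a verify_load can both check the loaded weights and fill in derived parameters, so reordering or skipping it can leave the model half-configured:

```python
# Made-up example of why the verify_load call order matters.
def verify_load(self):
    # validation: make sure every expected weight was actually loaded
    assert all(w is not None for w in self.weights), "some weights were not loaded"
    # parameter completion: derive values that later code relies on
    if getattr(self, "head_dim", None) is None:
        self.head_dim = self.hidden_size // self.num_attention_heads
```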

@bingo787
Contributor Author

Let's continue the discussion under #277.

@bingo787
Contributor Author

Resolved by adapting on the TGI side.
