bfloat16 - IT文库: free downloads of programming and internet e-books and documents for programmers


C++ High-Performance Parallel Programming and Optimization - Slides - 10 From Sparse Data Structures to Quantized Data Types

bfloat16 range: ~1e-38 to ~3e38
float32 range: ~1e-38 to ~3e38
float16 range: ~5.9e-8 to 6.5e…

## The easier one to convert: bfloat16 (the large-exponent version)

- Another simple approach is to bluntly cut a 32-bit float off at bit 16, keep only the high 16 bits, and store them as a non-standard half. This is called bfloat16 (with a "b" in front).
- Because bfloat16 is cut directly out of a float, only the mantissa is truncated; the exponent is left completely unchanged.
- bfloat16 has 8 exponent bits and 7 mantissa bits.
- float16 has 5 exponent bits and 10 mantissa bits.
- So bfloat16 spends more bits on the exponent and few on the mantissa, which makes it somewhat less precise. The advantage is that converting to and from float is simple to implement with bit operations.
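The truncation described above can be sketched in a few lines. This is a minimal illustration, not the slides' own code (the course itself uses C++); the function names `float_to_bfloat16_bits` and `bfloat16_bits_to_float` are hypothetical, and the sketch assumes plain truncation with no rounding, as the text describes.

```python
import struct


def float_to_bfloat16_bits(x: float) -> int:
    """Return the high 16 bits of x's float32 encoding (truncation, no rounding)."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Keep sign (1 bit), exponent (8 bits), and the top 7 mantissa bits.
    return bits >> 16


def bfloat16_bits_to_float(b: int) -> float:
    """Expand 16 bfloat16 bits back to a float by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


if __name__ == "__main__":
    b = float_to_bfloat16_bits(3.14159)
    # The exponent survives intact; only low mantissa bits are lost.
    print(hex(b), bfloat16_bits_to_float(b))  # 0x4049 3.140625
```

Both directions are a single shift, which is the simplicity the text refers to. Production converters often round to nearest instead of truncating; that refinement is left out here.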

0 points | 102 pages | 9.50 MB | 2 years ago
vLLM v0.5.0 Documentation

[--load-format {auto,pt,safetensors,npcache,dummy,tensorizer,bitsandbytes}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path … MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … using bitsandbytes quantization. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision for …

0 points | 132 pages | 1.05 MB | 5 months ago
vLLM v0.4.3 Documentation

[--load-format {auto,pt,safetensors,npcache,dummy,tensorizer}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path … LORA_EXTRA_VOCAB_… EXTRA_CONFIG] … [--long-lora-scaling-factors LONG_LORA_… [--lora-dtype {auto,float16,bfloat16,… [SERVED_MODEL_NAME ...] … TOKENS] … MAX_MODEL_LEN] [--device {auto,cuda,neuron,cpu}] [--image-input-type … Examples section for more information. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision …

0 points | 121 pages | 1.02 MB | 5 months ago
vLLM v0.5.0.post1 Documentation

[--load-format {auto,pt,safetensors,npcache,dummy,tensorizer,bitsandbytes}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path … MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … bitsandbytes quantization. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision …

0 points | 144 pages | 1.09 MB | 5 months ago
vLLM v0.4.0.post1 Documentation

… DOWNLOAD_DIR] [--load-format {auto,pt,safetensors,npcache,dummy}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8_e5m2}] [--max-model-len MAX_MODEL_LEN] [--worker-use-ray] … [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--max-cpu-loras MAX_CPU_LORAS] [--device {auto,cuda,neuron,cpu}] [--image-inpu… values, which is mainly for profiling. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. The "auto" option will use FP16 precision …

0 points | 68 pages | 810.15 KB | 5 months ago
vLLM v0.6.1.post2 Documentation

…state,gguf,bitsandbytes,mistral}] [--config-format {auto,hf,mistral}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path QU… [--max-lora-rank MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … available else it will try to load in mistral format. --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision …

0 points | 215 pages | 1.29 MB | 5 months ago
vLLM v0.6.1.post1 Documentation

…state,gguf,bitsandbytes,mistral}] [--config-format {auto,hf,mistral}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path QU… [--max-lora-rank MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … available else it will try to load in mistral format. --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision …

0 points | 215 pages | 1.28 MB | 5 months ago
vLLM v0.5.2 Documentation

[--load-format {auto,pt,safetensors,npcache,dummy,tensorizer,bitsandbytes}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path … MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … using bitsandbytes quantization. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision for …

0 points | 166 pages | 1.15 MB | 5 months ago
vLLM v0.5.5 Documentation

…safetensors,npcache,dummy,tensorizer,sharded_state,gguf,bitsandbytes}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8,fp8_e5m2,fp8_e4m3}] [--quantization-param-path QU… [--max-lora-rank MAX_LORA_RANK] [--lora-extra-vocab-size LORA_EXTRA_VOCAB_SIZE] [--lora-dtype {auto,float16,bfloat16,float32}] [--long-lora-scaling-factors LONG_LORA_SCALING_FACTORS] [--max-cpu-loras MAX_CPU_LORAS] … (added to the base model vocabulary). Default: 256. --lora-dtype: Possible choices: auto, float16, bfloat16, float32. Data type for LoRA. If auto, will default to base model dtype. Default: "auto".

0 points | 193 pages | 1.22 MB | 5 months ago
vLLM v0.4.1 Documentation

… the serialized weights. Default: "auto". --dtype: Possible choices: auto, half, float16, bfloat16, float, float32. Data type for model weights and activations. "auto" will use FP16 precision … (added to the base model vocabulary). Default: 256. --lora-dtype: Possible choices: auto, float16, bfloat16, float32. Data type for LoRA. If auto, will default to base model dtype. Default: "auto". … [--load-format {auto,pt,safetensors,npcache,dummy,tensorizer}] [--dtype {auto,half,float16,bfloat16,float,float32}] [--kv-cache-dtype {auto,fp8}] [--quantization-param-path QUANTIZATION_PARAM_PATH]

0 points | 101 pages | 894.09 KB | 5 months ago
