Upload folder using huggingface_hub
README.md CHANGED
```diff
@@ -112,7 +112,7 @@ We welcome MLLM benchmark developers to assess our InternVL1.5 and InternVL2 series
 
 We provide an example code to run InternVL2-40B using `transformers`.
 
-We also welcome you to experience the InternVL2 series models in our [online demo](https://internvl.opengvlab.com/).
+We also welcome you to experience the InternVL2 series models in our [online demo](https://internvl.opengvlab.com/).
 
 > Please use transformers==4.37.2 to ensure the model works normally.
 
```
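The section touched above references the README's example for running InternVL2-40B with `transformers` (pinned to `transformers==4.37.2`). Below is a minimal loading sketch, assuming the checkpoint id `OpenGVLab/InternVL2-40B` and the `split_model` helper whose head appears in the hunks that follow; neither is shown in full by this diff.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint id; the diff names the model but not the repo path.
path = 'OpenGVLab/InternVL2-40B'

# split_model builds a per-layer device map; its head appears in the hunks below.
device_map = split_model('InternVL2-40B')

model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,   # half precision so the 40B model fits across GPUs
    low_cpu_mem_usage=True,
    trust_remote_code=True,       # InternVL2 ships custom modeling code
    device_map=device_map).eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)
```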
```diff
@@ -162,7 +162,7 @@ def split_model(model_name):
     device_map = {}
     world_size = torch.cuda.device_count()
     num_layers = {
-        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32, 'InternVL2-8B': 32,
+        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32, 'InternVL2-8B': 32,
         'InternVL2-26B': 48, 'InternVL2-40B': 60, 'InternVL2-Llama3-76B': 80}[model_name]
     # Since the first GPU will be used for ViT, treat it as half a GPU.
     num_layers_per_gpu = math.ceil(num_layers / (world_size - 0.5))
```
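Only the head of `split_model` is visible in this hunk. As a sketch of how such a device map can be completed, assuming a round-robin layer assignment and module names (`language_model.model.layers.*`, `vision_model`, `mlp1`) that are not part of this diff:

```python
import math
import torch

def split_model(model_name):
    device_map = {}
    world_size = torch.cuda.device_count()
    num_layers = {
        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32, 'InternVL2-8B': 32,
        'InternVL2-26B': 48, 'InternVL2-40B': 60, 'InternVL2-Llama3-76B': 80}[model_name]
    # Since the first GPU will be used for ViT, treat it as half a GPU.
    num_layers_per_gpu = math.ceil(num_layers / (world_size - 0.5))
    # Assumed continuation: GPU 0 keeps roughly half a share of decoder layers,
    # and each remaining GPU takes a full share.
    shares = [num_layers_per_gpu] * world_size
    shares[0] = math.ceil(num_layers_per_gpu * 0.5)
    layer_cnt = 0
    for gpu_id, share in enumerate(shares):
        for _ in range(share):
            if layer_cnt >= num_layers:
                break
            # Hypothetical module path; the README defines the exact names.
            device_map[f'language_model.model.layers.{layer_cnt}'] = gpu_id
            layer_cnt += 1
    # Keep the vision tower and the projector together on GPU 0.
    device_map['vision_model'] = 0
    device_map['mlp1'] = 0
    return device_map
```

The `world_size - 0.5` denominator implements the comment in the hunk: with 8 GPUs and InternVL2-40B (60 decoder layers), `math.ceil(60 / 7.5)` gives 8 layers per full GPU, and GPU 0 keeps `math.ceil(8 * 0.5) = 4` of them next to the ViT, so all 60 layers are covered (4 + 7 × 8 = 60).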
```diff
@@ -284,7 +284,7 @@ def split_model(model_name):
     device_map = {}
     world_size = torch.cuda.device_count()
     num_layers = {
-        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32, 'InternVL2-8B': 32,
+        'InternVL2-1B': 24, 'InternVL2-2B': 24, 'InternVL2-4B': 32, 'InternVL2-8B': 32,
         'InternVL2-26B': 48, 'InternVL2-40B': 60, 'InternVL2-Llama3-76B': 80}[model_name]
     # Since the first GPU will be used for ViT, treat it as half a GPU.
     num_layers_per_gpu = math.ceil(num_layers / (world_size - 0.5))
```
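Since the same allocation code appears at both README locations (lines 162 and 284), a quick sanity check of the sketch above can confirm the resulting map is complete; the key names are the hypothetical ones introduced in that sketch.

```python
# Hypothetical sanity check for the sketch above (requires visible CUDA GPUs).
device_map = split_model('InternVL2-40B')
decoder_keys = [k for k in device_map if k.startswith('language_model.model.layers.')]
assert len(decoder_keys) == 60, 'every decoder layer of InternVL2-40B should be placed'
assert device_map['vision_model'] == 0, 'the ViT is expected to sit on GPU 0'
```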