LangChain pipeline VRAM usage when loading a model

I'm trying to load a quantized LLaMA-based model (7B, 4-bit, group size 128) from a local directory. The model itself is only an example; I tested others and hit similar problems. Creating the pipeline completely eats up my 8 GB of VRAM:

[Screenshot: before pipeline generation]

[Screenshot: error during pipeline generation]

My code:

from langchain.llms import HuggingFacePipeline
from langchain import PromptTemplate, LLMChain

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM, LlamaConfig, pipeline

torch.cuda.set_device(torch.device("cuda:0"))

PATH = './models/wizardLM-7B-GPTQ-4bit-128g'
config = LlamaConfig.from_json_file(f'{PATH}/config.json')
base_model = LlamaForCausalLM(config=config).half()

torch.cuda.empty_cache()
tokenizer = LlamaTokenizer.from_pretrained(
    pretrained_model_name_or_path=PATH,
    low_cpu_mem_usage=True,
    local_files_only=True
)
torch.cuda.empty_cache()

pipe = pipeline(
    "text-generation",
    model=base_model,
    tokenizer=tokenizer,
    batch_size=1,
    device=0,
    max_length=100,
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.2
)

How can I make the pipeline initialization consume less VRAM?

GPU: AMD Radeon RX 6600 (8 GB VRAM, ROCm 5.4.2, torch 2.0.1+rocm5.4.2)

I should mention that I managed to load the same model in other frameworks such as "KoboldAI" and "text-generation-webui", so I know it should be possible.
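As a rough sanity check, here is a back-of-the-envelope estimate of the memory needed just to hold the weights. It is a sketch, not a measurement: it assumes about 7 billion parameters (taken from the "7B" in the model name), ignores activations, the KV cache and quantization metadata, and the names `weight_memory_gib` and `N_PARAMS` are only illustrative. Because the code above builds the model from its config and calls `.half()`, the weights handed to the pipeline would be fp16 rather than 4-bit.

```python
# Rough estimate of the VRAM needed just to store model weights.
# Assumption: ~7e9 parameters, inferred from "7B" in the model name.

GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (no activations, no KV cache)."""
    return n_params * bits_per_param / 8 / GIB


N_PARAMS = 7e9  # assumed parameter count, not read from the checkpoint

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # weights after .half()
int4_gib = weight_memory_gib(N_PARAMS, 4)   # 4-bit weights, ignoring group scales

print(f"fp16 weights : {fp16_gib:.1f} GiB")
print(f"4-bit weights: {int4_gib:.1f} GiB")
```

Under these assumptions the fp16 weights alone come to roughly 13 GiB, which would not fit in 8 GB, while a 4-bit copy would be a little over 3 GiB. That would be consistent with the other frameworks managing to load the model.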

My goal is to load the model "wizardLM-7B-GPTQ-4bit-128g", downloaded from Hugging Face, and run it with LangChain in Python.

pip list output:

Package                  Version
------------------------ ----------------
accelerate               0.19.0
aiofiles                 23.1.0
aiohttp                  3.8.4
aiosignal                1.3.1
altair                   5.0.0
anyio                    3.6.2
argilla                  1.7.0
async-timeout            4.0.2
attrs                    23.1.0
backoff                  2.2.1
beautifulsoup4           4.12.2
bitsandbytes             0.39.0
certifi                  2022.12.7
cffi                     1.15.1
chardet                  5.1.0
charset-normalizer       2.1.1
chromadb                 0.3.23
click                    8.1.3
clickhouse-connect       0.5.24
cmake                    3.25.0
colorclass               2.2.2
commonmark               0.9.1
compressed-rtf           1.0.6
contourpy                1.0.7
cryptography             40.0.2
cycler                   0.11.0
dataclasses-json         0.5.7
datasets                 2.12.0
Deprecated               1.2.13
dill                     0.3.6
duckdb                   0.8.0
easygui                  0.98.3
ebcdic                   1.1.1
et-xmlfile               1.1.0
extract-msg              0.41.1
fastapi                  0.95.2
ffmpy                    0.3.0
filelock                 3.9.0
fonttools                4.39.4
frozenlist               1.3.3
fsspec                   2023.5.0
gradio                   3.28.3
gradio_client            0.2.5
greenlet                 2.0.2
h11                      0.14.0
hnswlib                  0.7.0
httpcore                 0.16.3
httptools                0.5.0
httpx                    0.23.3
huggingface-hub          0.14.1
idna                     3.4
IMAPClient               2.3.1
Jinja2                   3.1.2
joblib                   1.2.0
jsonschema               4.17.3
kiwisolver               1.4.4
langchain                0.0.171
lark-parser              0.12.0
linkify-it-py            2.0.2
lit                      15.0.7
llama-cpp-python         0.1.50
loralib                  0.1.1
lxml                     4.9.2
lz4                      4.3.2
Markdown                 3.4.3
markdown-it-py           2.2.0
MarkupSafe               2.1.2
marshmallow              3.19.0
marshmallow-enum         1.5.1
matplotlib               3.7.1
mdit-py-plugins          0.3.3
mdurl                    0.1.2
monotonic                1.6
mpmath                   1.2.1
msg-parser               1.2.0
msoffcrypto-tool         5.0.1
multidict                6.0.4
multiprocess             0.70.14
mypy-extensions          1.0.0
networkx                 3.0
nltk                     3.8.1
numexpr                  2.8.4
numpy                    1.24.1
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-cupti-cu11   11.7.101
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
nvidia-cufft-cu11        10.9.0.58
nvidia-curand-cu11       10.2.10.91
nvidia-cusolver-cu11     11.4.0.1
nvidia-cusparse-cu11     11.7.4.91
nvidia-nccl-cu11         2.14.3
nvidia-nvtx-cu11         11.7.91
olefile                  0.46
oletools                 0.60.1
openai                   0.27.7
openapi-schema-pydantic  1.2.4
openpyxl                 3.1.2
orjson                   3.8.12
packaging                23.1
pandas                   1.5.3
pandoc                   2.3
pcodedmp                 1.2.6
pdfminer.six             20221105
Pillow                   9.3.0
pip                      23.0.1
plumbum                  1.8.1
ply                      3.11
posthog                  3.0.1
psutil                   5.9.5
pyarrow                  12.0.0
pycparser                2.21
pydantic                 1.10.7
pydub                    0.25.1
Pygments                 2.15.1
pygpt4all                1.1.0
pygptj                   2.0.3
pyllamacpp               2.3.0
pypandoc                 1.11
pyparsing                2.4.7
pyrsistent               0.19.3
python-dateutil          2.8.2
python-docx              0.8.11
python-dotenv            1.0.0
python-magic             0.4.27
python-multipart         0.0.6
python-pptx              0.6.21
pytorch-triton-rocm      2.0.1
pytz                     2023.3
pytz-deprecation-shim    0.1.0.post0
PyYAML                   6.0
red-black-tree-mod       1.20
regex                    2023.5.5
requests                 2.28.1
responses                0.18.0
rfc3986                  1.5.0
rich                     13.0.1
RTFDE                    0.0.2
scikit-learn             1.2.2
scipy                    1.10.1
semantic-version         2.10.0
sentence-transformers    2.2.2
sentencepiece            0.1.99
setuptools               66.0.0
six                      1.16.0
sniffio                  1.3.0
soupsieve                2.4.1
SQLAlchemy               2.0.15
starlette                0.27.0
sympy                    1.11.1
tabulate                 0.9.0
tenacity                 8.2.2
threadpoolctl            3.1.0
tokenizers               0.13.3
toolz                    0.12.0
torch                    2.0.1+rocm5.4.2
torchaudio               2.0.2+rocm5.4.2
torchvision              0.15.2+rocm5.4.2
tqdm                     4.65.0
transformers             4.30.0.dev0
triton                   2.0.0
typer                    0.9.0
typing_extensions        4.4.0
typing-inspect           0.8.0
tzdata                   2023.3
tzlocal                  4.2
uc-micro-py              1.0.2
unstructured             0.6.6
urllib3                  1.26.13
uvicorn                  0.22.0
uvloop                   0.17.0
watchfiles               0.19.0
websockets               11.0.3
wheel                    0.38.4
wikipedia                1.4.0
wrapt                    1.14.1
XlsxWriter               3.1.0
xxhash                   3.2.0
yarl                     1.9.2
zstandard                0.21.0

