You will need more than 24 GB of VRAM to quantize a 7B-parameter model like Mistral this way. I didn't have that, so I rented some H100 time on vast.ai for about $5.
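(A quick sanity check, assuming the NVIDIA driver utilities are installed on the instance: you can confirm how much memory the GPU actually exposes with

nvidia-smi --query-gpu=name,memory.total --format=csv

which should report roughly 80 GB on an H100.)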
On the cloud GPU, run a Docker container from an image built by the following script:
git lfs install
git clone https://github.com/NVIDIA/TensorRT-LLM.git -b v0.8.0
cd TensorRT-LLM