• Mikina@programming.dev
    3 upvotes · 4 months ago

    I’m not sure what “FP16/FP8/INT4” means, or where an RTX 4090 would fall in those categories, but the VRAM required is 810 GB/403 GB/203 GB respectively. I guess the 4090 would fall under INT4?

    • Kevin@programming.dev
      7 upvotes · edited 4 months ago

      FP16 and FP8 stand for 16-bit and 8-bit floating point, and INT4 is a 4-bit integer. Normal floating point numbers are generally 32 or 64 bits in size, so if you’re willing to sacrifice some range and precision, you can save a lot of the space used by the model. Oh, and these formats describe how the model’s weights are stored rather than a category of GPU, so the 4090 doesn’t “fall under” any of them.
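
To make the arithmetic concrete, here is a rough sketch of how those VRAM figures scale with the bits per weight. The parameter count of 405 billion is an assumption inferred from the 810 GB FP16 figure quoted above (810 GB / 2 bytes per weight), not something stated in the thread, and the helper name `weight_memory_gb` is made up for illustration. It only counts the weights, so real usage would be somewhat higher.

```python
# Rough estimate of the memory needed for the weights alone:
# parameters * bits per weight / 8 = bytes.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "INT4": 4}


def weight_memory_gb(num_params: float, fmt: str) -> float:
    """Approximate size of the weights in GB (10^9 bytes) for a storage format."""
    return num_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


# Assumption: 405e9 parameters, back-derived from the 810 GB FP16 figure.
NUM_PARAMS = 405e9

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{weight_memory_gb(NUM_PARAMS, fmt):.1f} GB")
```

Halving the bits per weight halves the memory, which is why the three figures are roughly 810, 405 and 203 GB. For scale, an RTX 4090 has 24 GB of VRAM, so even the INT4 version is far too large for a single card.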