• brucethemoose@lemmy.world
    13 days ago

    For RAG data? It works.

    But it's too slow for the weights. What generative models fundamentally do is run a full pass through the multi-gigabyte weights for every ‘word’ (token) or diffusion step, so memory bandwidth is the bottleneck, and even 128-bit (dual-channel) DDR5 like you find on desktop CPUs is too slow.
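
To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes every token has to read all the weights from memory once, so bandwidth divided by weight size is a ceiling on tokens per second. The DDR5-5600 speed and the 40 GB model size are example figures I picked for illustration, and the function names are made up for this sketch. Real throughput will be lower than the theoretical peak.

```python
def peak_bandwidth_gbs(megatransfers_per_sec: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return megatransfers_per_sec * 1e6 * bytes_per_transfer / 1e9


def max_tokens_per_sec(weights_gb: float, bandwidth_gbs: float) -> float:
    """Upper bound on tokens/s if each token reads all weights once."""
    return bandwidth_gbs / weights_gb


if __name__ == "__main__":
    # Assumed example: DDR5-5600 on a 128-bit (dual-channel) desktop bus.
    bandwidth = peak_bandwidth_gbs(5600, 128)
    # Assumed example: a model whose weights take 40 GB in memory.
    ceiling = max_tokens_per_sec(40, bandwidth)
    print(f"peak bandwidth: {bandwidth:.1f} GB/s")
    print(f"token rate ceiling: {ceiling:.2f} tokens/s")
```

Anything slower than DRAM just makes that ceiling lower, which is why it's fine for RAG lookups but not for holding the weights.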