Scaling Up On-Device LLMs via Active-Weight Swapping Between DRAM and Flash

Date: