Introduction
Recently, I’ve been exploring large language models (LLMs) and came across Moonshot AI, a company that really caught my attention. Their flagship model, Kimi K2, claims to handle up to 128K tokens of context and uses a cutting-edge Mixture of Experts (MoE) architecture.