RedPajama is “a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens”. It’s a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, …
GitHub - Zjh-819/LLMDataHub: A quick guide (especially) for
Red Pajama: An Open-Source Llama Model
RedPajama: New Open-Source LLM Reproducing LLaMA Training Dataset
Dolma, OLMo, and the Future of Open-Source LLMs
Exploring the Strange World of LLMs – Jini AI
Data collection for LLMs - Argilla 1.14 documentation
What is RedPajama? - by Michael Spencer
Supervised Fine-tuning: customizing LLMs
Web LLM runs the vicuna-7b Large Language Model entirely in your
From ChatGPT to LLaMA to RedPajama: I'm Switching My Interest to