- TheBirdLegacy/FreeLoaderLM
  Text Generation • Updated
- CofeAI/FLM-101B
  Text Generation • Updated • 15 • 92
- FLM-101B: An Open LLM and How to Train It with $100K Budget
  Paper • 2309.03852 • Published • 45
- Composable Function-preserving Expansions for Transformer Architectures
  Paper • 2308.06103 • Published • 21
Collections including paper arxiv:2309.12307
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 79
- Challenges and Applications of Large Language Models
  Paper • 2307.10169 • Published • 51
- Efficiently Modeling Long Sequences with Structured State Spaces
  Paper • 2111.00396 • Published • 3
- DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
  Paper • 2006.08381 • Published

- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 79
- One Wide Feedforward is All You Need
  Paper • 2309.01826 • Published • 34
- Self-Alignment with Instruction Backtranslation
  Paper • 2308.06259 • Published • 43
- Shepherd: A Critic for Language Model Generation
  Paper • 2308.04592 • Published • 33