
Samba/pretrain.py at main · microsoft/Samba · GitHub
Should rewrite it in the future. if resume: if curr_iter < initial_iter: curr_iter += 1 continue else: resume = False curr_iter = -1 fabric.barrier() fabric.print("resume finished, taken {} seconds".format …
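The truncated snippet above appears to skip batches that were already consumed before a checkpoint, then continue training normally. A minimal sketch of that pattern follows, assuming a plain Python iterable in place of the real dataloader. The function name `skip_to_resume_point` and its parameters are hypothetical, and the Fabric barrier and print calls from the original are replaced by a plain `print`.

```python
import time
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def skip_to_resume_point(batches: Iterable[T], initial_iter: int) -> Iterator[T]:
    """Yield only the batches at or after position `initial_iter`.

    Hypothetical helper: it mirrors the skip-and-continue loop in the
    snippet, not the repository's actual function.
    """
    resume = initial_iter > 0
    curr_iter = 0
    start = time.perf_counter()
    for batch in batches:
        if resume:
            if curr_iter < initial_iter:
                # Still replaying batches seen before the checkpoint.
                curr_iter += 1
                continue
            # Reached the resume point: stop skipping from here on.
            resume = False
            elapsed = time.perf_counter() - start
            print("resume finished, taken {:.3f} seconds".format(elapsed))
        yield batch


# Resuming at iteration 4 of a 10-batch stream skips batches 0 to 3.
remaining = list(skip_to_resume_point(range(10), initial_iter=4))
```

Skipping by iterating is simple but costs time proportional to the number of batches replayed, which may be why the original comment suggests a future rewrite.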
Samba: Simple Hybrid State Space Models for Efficient ... - GitHub
Samba is a simple yet powerful hybrid model with an unlimited context length. Its architecture is frustratingly simple: Samba = Mamba + MLP + Sliding Window Attention + MLP stacking at the layer …
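The layer-level stacking described above can be illustrated with a small pure-Python sketch. It only enumerates the layer order and builds a causal sliding-window mask. The names `SAMBA_BLOCK`, `layer_pattern`, and `sliding_window_mask` are hypothetical. The window convention (a token attends to itself and the `window - 1` tokens before it) is also an assumption, not something taken from the repository.

```python
from typing import List

# One repeating unit, in the order given in the description above.
SAMBA_BLOCK = ("mamba", "mlp", "sliding_window_attention", "mlp")


def layer_pattern(n_blocks: int) -> List[str]:
    """Return the layer types for `n_blocks` repetitions of the unit."""
    if n_blocks < 0:
        raise ValueError("n_blocks must be non-negative")
    return list(SAMBA_BLOCK) * n_blocks


def sliding_window_mask(seq_len: int, window: int) -> List[List[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Assumed convention: causal, and limited to the last `window` positions
    including the current one.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


layers = layer_pattern(2)
mask = sliding_window_mask(seq_len=4, window=2)
```

Because each query sees at most `window` keys, attention cost per token stays constant as the sequence grows, while the Mamba layers carry longer-range state.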
Samba/lit_gpt/model.py at main · microsoft/Samba · GitHub
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - Samba/lit_gpt/model.py at main · microsoft/Samba
Pulse · microsoft/Samba · GitHub
LitGPT · Issue #6 · microsoft/Samba - GitHub
Jun 14, 2024 · Congrats on this research milestone 🙌! And it’s nice to see that our LitGPT library has been useful for this project. However, note that LitGPT is an open-source project, and the …
Models · Issue #4 · microsoft/Samba - GitHub
Jun 12, 2024 · [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling - microsoft/Samba
finetune code · Issue #20 · microsoft/Samba · GitHub
I'm currently pretraining with the Samba architecture, and I want to finetune the pretrained model for a specific task. Is there any related code or material that could help?
Error when using Docker · Issue #10 · microsoft/Samba · GitHub
Inferrence Code · Issue #8 · microsoft/Samba · GitHub
Jun 18, 2024 · Amazing work, team! Thank you sincerely for sharing. I have trained a toy model but have completely failed creating an inference script. Sharing one would be sincerely appreciated!
Samba/assets/Samba-pic.webp at main · microsoft/Samba · GitHub