Pinned · Anuj Arora in ML@ABEJA · Training a better GPT model: learnings from PaLM · Revisiting the model released in 2018 · 6 min read · Jun 16, 2022
Anuj Arora in ML@ABEJA · Beyond Limits: Unifying Strategies to Amplify Transformer Models · Enriching Computational Workspaces across Machine Learning Domains · 5 min read · Nov 16, 2023
Anuj Arora in ML@ABEJA · Flexible batchsizing for dataset with very diverse sequence lengths · Using GPU compute more efficiently (for transformers) · 5 min read · Oct 25, 2023
Anuj Arora in ML@ABEJA · Extending AutoModel class from Hugging Face · A hacky way to keep the code structure consistent · 2 min read · Apr 2, 2023
Anuj Arora in ML@ABEJA · Setting up a FastAPI server with GPU · Using serve from ray · 3 min read · Mar 25, 2023
Anuj Arora in Dive into ML/AI · Using GPU instance on GCP from VS Code · Step by step tutorial · 4 min read · Dec 15, 2022
Anuj Arora in Dive into ML/AI · Exponential Moving Average attention for landmarks based action recognition · Using inductive bias to replace standard multi-head attention · 2 min read · Oct 20, 2022
Anuj Arora in Dive into ML/AI · Tree data visualization with treelib · A straightforward solution for visualizing an often used data structure · 2 min read · Oct 8, 2022
Anuj Arora in Dive into ML/AI · Caution while using nn.Parameter in Pytorch · Avoid this mistake that cost me a month’s worth of lost experiments · 2 min read · Oct 4, 2022
Anuj Arora in Dive into ML/AI · Migrating from Intel to Apple Silicon · Issues and the hacky solutions. · 4 min read · Sep 28, 2022