Pinned · Anuj Arora in ML@ABEJA · Training a better GPT model: learnings from PaLM · Revisiting the model released in 2018 · Jun 16, 2022
Anuj Arora in ML@ABEJA · Beyond Limits: Unifying Strategies to Amplify Transformer Models · Enriching Computational Workspaces across Machine Learning Domains · Nov 16, 2023
Anuj Arora in ML@ABEJA · Flexible batchsizing for dataset with very diverse sequence lengths · Using GPU compute more efficiently (for transformers) · Oct 25, 2023
Anuj Arora in ML@ABEJA · Extending AutoModel class from Hugging Face · A hacky way to keep the code structure consistent · Apr 2, 2023
Anuj Arora in Dive into ML/AI · Using GPU instance on GCP from VS Code · Step by step tutorial · Dec 15, 2022
Anuj Arora in Dive into ML/AI · Exponential Moving Average attention for landmarks based action recognition · Using inductive bias to replace standard multi-head attention · Oct 20, 2022
Anuj Arora in Dive into ML/AI · Tree data visualization with treelib · A straightforward solution for visualizing an often used data structure · Oct 8, 2022
Anuj Arora in Dive into ML/AI · Caution while using nn.Parameter in Pytorch · Avoid this mistake that cost me a month’s worth of lost experiments · Oct 4, 2022
Anuj Arora in Dive into ML/AI · Migrating from Intel to Apple Silicon · Issues and the hacky solutions. · Sep 28, 2022