Pinned · Published in ML@ABEJA · Training a better GPT model: learnings from PaLM · Revisiting the model released in 2018 · Jun 16, 2022
Published in ML@ABEJA · Beyond Limits: Unifying Strategies to Amplify Transformer Models · Enriching Computational Workspaces across Machine Learning Domains · Nov 16, 2023
Published in ML@ABEJA · Flexible batchsizing for dataset with very diverse sequence lengths · Using GPU compute more efficiently (for transformers) · Oct 25, 2023
Published in ML@ABEJA · Extending AutoModel class from Hugging Face · A hacky way to keep the code structure consistent · Apr 2, 2023
Published in Dive into ML/AI · Using GPU instance on GCP from VS Code · Step by step tutorial · Dec 15, 2022
Published in Dive into ML/AI · Exponential Moving Average attention for landmarks based action recognition · Using inductive bias to replace standard multi-head attention · Oct 20, 2022
Published in Dive into ML/AI · Tree data visualization with treelib · A straightforward solution for visualizing an often used data structure · Oct 8, 2022
Published in Dive into ML/AI · Caution while using nn.Parameter in Pytorch · Avoid this mistake that cost me a month’s worth of lost experiments · Oct 4, 2022
Published in Dive into ML/AI · Migrating from Intel to Apple Silicon · Issues and the hacky solutions. · Sep 28, 2022