Paper Note: Swin Transformer
A new ViT whose representation is computed with shifted windows. ...
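A minimal sketch of the shifted-window step, assuming a (B, H, W, C) feature map; `window_size` and `shift` are illustrative values, not the paper's exact configuration. The cyclic shift makes the next round of window attention straddle the previous windows' boundaries.

```python
import torch

def shifted_window_partition(x, window_size=4, shift=2):
    """Cyclically shift a feature map, then split it into
    non-overlapping windows (the shifted-window step).
    x: (B, H, W, C). Values here are illustrative."""
    B, H, W, C = x.shape
    # Cyclic shift so windows straddle previous window boundaries.
    x = torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))
    # Partition into (window_size x window_size) windows.
    x = x.view(B, H // window_size, window_size,
               W // window_size, window_size, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).reshape(
        -1, window_size * window_size, C)
    return windows  # (num_windows * B, tokens_per_window, C)

x = torch.randn(1, 8, 8, 96)              # toy feature map
print(shifted_window_partition(x).shape)  # torch.Size([4, 16, 96])
```

Self-attention is then computed within each window, so cost grows linearly with image size instead of quadratically as in global attention.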
Masked autoencoders (MAE) are scalable self-supervised learners for computer vision. ...
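The core of MAE is masking a high proportion of patch tokens and encoding only the visible ones. A minimal sketch of that random-masking step, with illustrative shapes and names:

```python
import torch

def random_masking(tokens, mask_ratio=0.75):
    """MAE-style random masking: keep a random subset of patch
    tokens; only these are fed to the encoder. Illustrative sketch."""
    B, N, D = tokens.shape
    num_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)            # one random score per token
    ids_shuffle = noise.argsort(dim=1)  # random permutation of tokens
    ids_keep = ids_shuffle[:, :num_keep]
    visible = torch.gather(
        tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    return visible, ids_keep            # encoder sees only `visible`

tokens = torch.randn(2, 196, 768)   # e.g. 14x14 patch tokens
visible, _ = random_masking(tokens)
print(visible.shape)                # torch.Size([2, 49, 768])
```

A lightweight decoder then reconstructs the masked patches' pixels, which is what makes pre-training cheap at high mask ratios.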
ViT applies a standard Transformer directly to images by treating a sequence of image patches as tokens ...
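A sketch of the patch-embedding step that turns an image into a token sequence; using a strided Conv2d as the per-patch linear projection is the common implementation trick, and the sizes below are illustrative.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into fixed-size patches and linearly project
    each patch to a token, as in ViT. Sizes are illustrative."""
    def __init__(self, img_size=224, patch_size=16, dim=768):
        super().__init__()
        # A strided conv equals flatten-then-linear applied per patch.
        self.proj = nn.Conv2d(3, dim, kernel_size=patch_size,
                              stride=patch_size)

    def forward(self, x):                    # x: (B, 3, H, W)
        x = self.proj(x)                     # (B, dim, H/16, W/16)
        return x.flatten(2).transpose(1, 2)  # (B, num_patches, dim)

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```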
The Transformer is the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder-decoder architectures with multi-headed self-attention. ...
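The operation each head computes is scaled dot-product attention, $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^\top / \sqrt{d_k})\,V$; multi-head attention runs it in parallel over several heads. A minimal sketch:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    the core operation multi-head self-attention is built from."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    return scores.softmax(dim=-1) @ v

# Multi-head self-attention applies this per head in parallel:
# shapes here are (batch, heads, seq_len, head_dim).
q = k = v = torch.randn(2, 8, 10, 64)
print(scaled_dot_product_attention(q, k, v).shape)  # (2, 8, 10, 64)
```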
BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks. ...
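In practice, "one additional output layer" for classification means a single linear layer on the [CLS] token's final hidden state. A sketch under that assumption; `encoder` stands in for any module returning (B, seq_len, hidden) states, and the names are illustrative rather than a specific library's API.

```python
import torch
import torch.nn as nn

class BertClassifier(nn.Module):
    """Fine-tune a pre-trained encoder by adding one linear layer
    on the [CLS] token's final hidden state. Illustrative sketch."""
    def __init__(self, encoder, hidden=768, num_labels=2):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Linear(hidden, num_labels)  # the one new layer

    def forward(self, input_ids):
        hidden_states = self.encoder(input_ids)  # (B, L, hidden)
        cls = hidden_states[:, 0]                # [CLS] is token 0
        return self.classifier(cls)              # (B, num_labels)

# Usage with a dummy stand-in for the pre-trained encoder:
dummy_encoder = lambda ids: torch.randn(ids.size(0), ids.size(1), 768)
model = BertClassifier(dummy_encoder)
print(model(torch.zeros(2, 16, dtype=torch.long)).shape)  # (2, 2)
```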