Disclaimer: The views and opinions expressed in this blog are entirely my own and do not necessarily reflect the views of my current or any previous employer. This blog may also contain links to other websites or resources. I am not responsible for the content on those external sites or any changes that may occur after the publication of my posts.
End Disclaimer
“As a category, impossible is all the stuff that has never been done before.” -Steven Kotler
“Progress Compounds” -Bill Ackman
News
AIML
World’s first major act to regulate AI passed by the European Parliament
How the A.I. That Drives ChatGPT Will Move Into the Physical World
Diffusion models from scratch, from a new theoretical perspective
What I learned from looking at 900 most popular open source AI tools
The Top 100 GenAI Consumer Apps (a16z)
I, Cyborg: Using Co-Intelligence
The State of Competitive Machine Learning
Some details on the 24,000 H100 GPU cluster used to train Llama-3
Painting
Lightning Struck a Flock of Witches, William Holbrook Beard (1824-1900), n.d., oil on cardboard, 12 1⁄4 × 17 5⁄8 in. (31.1 × 44.8 cm), Smithsonian American Art Museum
Crypto
What’s Behind the Bitcoin Price Surge? Vibes, Mostly
Cyber
'Inception attacks' on Meta VR headsets can trap users in a fake VR environment
Incognito Darknet Market Mass-Extorts Buyers, Sellers
Markets
Automakers Are Sharing Consumers’ Driving Behavior With Insurance Companies
The U.S. wildfire index: How and where wildfires spread across America
In Silicon Valley, Venture Capital Meets a Generational Shift
It’s Hard to Beat That 7% Mortgage Rate
Inside the “standoff” over what office buildings are worth
Markets Still Expect Rate Cut In June
Misc
‘Jamming’: How Electronic Warfare Is Reshaping Ukraine’s Battlefields
This Nvidia Cofounder Could Have Been Worth $70 Billion. Instead He Lives Off The Grid
Back to the Future Timeline Visualized
Talking on the Moon: The quest to establish lunar wi-fi
In Mongolia, a Killer Winter Is Ravaging Herds and a Way of Life
Salary Needed to Live Comfortably in U.S. Cities
SpaceX celebrates major progress on the third flight of Starship
Paper
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
“In short, I-JEPA is able to learn high-level representations of object parts without discarding their localized positional information in the image.”
I-JEPA Methodology: Introduces a non-generative approach to self-supervised learning that predicts the representations of several target blocks within the same image from a single context block, using a masking strategy that guides the model toward semantic representations (a minimal sketch of the training step appears after the Key Components list below).
Key Components:
Utilizes Vision Transformers with a novel predictive architecture where the loss is computed entirely in representation space, encouraging the learning of more abstract, semantic features.
Emphasizes the importance of a well-designed masking strategy, demonstrating that predicting larger target blocks with a sufficiently informative context block enhances semantic learning.
Achieves significant computational efficiency and scalability, requiring fewer computational resources than previous methods while matching or exceeding their performance across a variety of tasks.
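To make the architecture concrete, here is a minimal sketch of one I-JEPA-style training step in PyTorch. It is illustrative only: the function name, the encoder and predictor call signatures (mask=, positions=), and the EMA decay value are my assumptions, not the paper's released code. The point is simply that the loss compares predicted and target representations, never pixels.

```python
import torch
import torch.nn.functional as F

def ijepa_step(context_encoder, target_encoder, predictor,
               images, context_mask, target_masks, optimizer, ema_decay=0.996):
    """One hypothetical training step: predict target-block representations
    from a single context block. Signatures here are assumptions."""
    # 1. Encode only the visible context patches with the context encoder.
    context_repr = context_encoder(images, mask=context_mask)

    # 2. Encode the full image with the target encoder (an EMA copy of the
    #    context encoder, no gradients), then pick out each target block.
    with torch.no_grad():
        full_repr = target_encoder(images)                 # (B, N, D) patch tokens
        targets = [full_repr[:, m] for m in target_masks]  # one tensor per block

    # 3. Predict each target block's representations from the context,
    #    conditioned on that block's patch positions.
    preds = [predictor(context_repr, positions=m) for m in target_masks]

    # 4. The loss lives entirely in representation space: average L2 distance
    #    between predicted and target patch representations (no pixel targets).
    loss = sum(F.mse_loss(p, t) for p, t in zip(preds, targets)) / len(preds)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # 5. Update the target encoder as an exponential moving average
    #    of the context encoder's weights.
    with torch.no_grad():
        for pt, pc in zip(target_encoder.parameters(), context_encoder.parameters()):
            pt.mul_(ema_decay).add_(pc, alpha=1.0 - ema_decay)

    return loss.item()
```

Using a slowly updated EMA copy as the target encoder is what keeps the representation-space targets stable while the context encoder trains.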
Empirical Evaluation:
Demonstrates strong off-the-shelf representations across a range of tasks, including linear classification, object counting, and depth prediction, without using view-based data augmentations.
I-JEPA is shown to be competitive with, and on semantic tasks even to outperform, existing view-invariant pretraining methods, while offering a simpler model with a less rigid inductive bias.
The scalability of I-JEPA is highlighted, with less compute required for pretraining compared to other methods, thanks to the efficiency of predicting in representation space.
Visualization and Ablations:
Provides insights into the functionality of the predictor in I-JEPA through visualization techniques, showing that it can accurately capture positional uncertainty and produce high-level object parts.
Ablation studies reinforce the importance of predicting in representation space and the effectiveness of the multi-block masking strategy for learning semantic representations (a rough sketch of this masking scheme follows below).
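For intuition about the multi-block masking strategy itself, here is a rough sketch of how the masks might be sampled on a ViT patch grid. The helper names are hypothetical, and the scale and aspect-ratio ranges are illustrative values in the spirit of the paper's reported settings; the key idea is several large target blocks plus one informative context block with all target patches removed.

```python
import random

def sample_block(grid_h, grid_w, scale_range, aspect_range):
    """Patch indices of one random rectangular block on the patch grid."""
    area = random.uniform(*scale_range) * grid_h * grid_w
    aspect = random.uniform(*aspect_range)
    h = max(1, min(grid_h, round((area * aspect) ** 0.5)))
    w = max(1, min(grid_w, round((area / aspect) ** 0.5)))
    top = random.randint(0, grid_h - h)
    left = random.randint(0, grid_w - w)
    return {(top + i) * grid_w + (left + j) for i in range(h) for j in range(w)}

def multi_block_masks(grid_h=14, grid_w=14, num_targets=4):
    # Several fairly large target blocks -- big enough to cover object parts.
    targets = [sample_block(grid_h, grid_w, (0.15, 0.2), (0.75, 1.5))
               for _ in range(num_targets)]
    # One large, informative context block, with every target patch removed
    # so the context never reveals what it must predict.
    context = sample_block(grid_h, grid_w, (0.85, 1.0), (1.0, 1.0))
    context -= set.union(*targets)
    return sorted(context), [sorted(t) for t in targets]
```

Removing the target patches from the context is what makes the task nontrivial: the model must infer the semantics of large missing regions rather than interpolate nearby pixels.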
Conclusion: I-JEPA presents a promising direction for self-supervised learning by efficiently learning semantic image representations without the need for hand-crafted data augmentations, demonstrating scalability and adaptability across a variety of tasks.
Podcast
Yann Lecun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI | Lex Fridman Podcast #416
I-JEPA: The first AI model based on Yann LeCun’s vision for more human-like AI