Research
🦩 DeepMind releases Flamingo, an 80-billion-parameter visual language model (VLM) that sets a new few-shot state of the art on 16 tasks, including visual question answering (VQA), hateful content classification, and captioning.
🕞 OpenAI released DALL·E 2 this month, the sequel to the original text-to-image generation model. This time, a prior produces a CLIP image embedding from the text caption, and a decoder then generates the image from that embedding.
Check out the interactive demo by OpenAI and a technical overview of the paper by AssemblyAI, including some intuition on CLIP embeddings.
🤓 Stanford researchers are working toward a theoretical understanding of why batch norm works in convex optimization and deep neural networks. In a paper released last March, the team analyzed its strong empirical results through the lens of convex duality.
📈 Boris Dayma shared his findings on Twitter, rather than arXiv, after training large transformers for 2,000+ hours. Check out some of his tips, including: don't use bias in dense layers; use GeLU or Swish as the activation instead of SmeLU; and NormFormer is more stable than Sandwich-LN.
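The first two tips above can be sketched in a few lines. This is a hypothetical minimal example (not Dayma's actual code), showing a transformer-style MLP sub-block with bias-free dense layers and a GeLU activation, written with plain NumPy so the pieces are explicit:

```python
import numpy as np

def gelu(x):
    # GeLU, tanh approximation (as used in GPT-style models).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_block(x, w_in, w_out):
    # Tip 1: no bias terms in the dense layers (only weight matrices).
    # Tip 2: GeLU as the nonlinearity between them.
    return gelu(x @ w_in) @ w_out

# Toy usage: batch of 4 tokens with hidden size 8, 4x MLP expansion.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w_in = 0.02 * rng.standard_normal((8, 32))
w_out = 0.02 * rng.standard_normal((32, 8))
y = mlp_block(x, w_in, w_out)  # shape (4, 8)
```

In frameworks like PyTorch the same choice is just `nn.Linear(d_in, d_out, bias=False)` followed by `nn.GELU()`.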