Issue #257

September 25, 2024

This week's book is "Why Machines Learn: The Elegant Math Behind Modern AI" by A. Ananthaswamy. The book introduces the main ideas and developments of Artificial Intelligence clearly and concisely. It starts with the invention of the Perceptron in the 1950s, moves through the significant developments of the following decades, such as Support Vector Machines, Hopfield Networks, and Backpropagation, and ends with the latest developments in Large Language Models. Ananthaswamy explains how these ideas fit into the historical development of Computer Science and AI, as well as how they connect to insights originating in biology and psychology.

The book targets a general audience familiar with basic math. Mathematical concepts such as probability and linear algebra are introduced in an intuitive way that provides just enough detail to understand the more technical parts of the text. Overall, it's a great resource whether you're reviewing these concepts or encountering them for the first time.

Why Machines Learn: The Elegant Math Behind Modern AI

1. The Pragmatic Programmer for Machine Learning [ppml.dev]
2. Andrey Markov & Claude Shannon Counted Letters to Build the First Language-Generation Models [spectrum.ieee.org]
3. How ‘Embeddings’ Encode What Words Mean — Sort Of [quantamagazine.org]
4. Python in Excel – Available Now [techcommunity.microsoft.com]
5. 25 Amazing Python Tricks That Will Instantly Improve Your Code [medium.com/pythoneers]
6. ICML 2024 Top Papers : What’s New in Machine Learning? [blog.cubed.run]
7. The Math Behind Kernel Density Estimation [towardsdatascience.com]

• Complexity and entropy of natural patterns (H. Wang, C. Song, P. Gao)
• Replay shapes abstract cognitive maps for efficient social navigation (J.-Y. Son, M.-L. Vives, A. Bhandari, O. FeldmanHall)
• Durably reducing conspiracy beliefs through dialogues with AI (T. H. Costello, G. Pennycook, D. G. Rand)
• Training Language Models to Self-Correct via Reinforcement Learning (A. Kumar, V. Zhuang, R. Agarwal, Y. Su, J. D. Co-Reyes, A. Singh, K. Baumli, S. Iqbal, C. Bishop, R. Roelofs, L. M. Zhang, K. McKinney, D. Shrivastava, C. Paduraru, G. Tucker, D. Precup, F. Behbahani, A. Faust)
• Schrodinger's Memory: Large Language Models (W. Wang, Q. Li)
• A Primer on the Inner Workings of Transformer-based Language Models (J. Ferrando, G. Sarti, A. Bisazza, M. R. Costa-jussà)
• Chain of Thought Empowers Transformers to Solve Inherently Serial Problems (Z. Li, H. Liu, D. Zhou, T. Ma)

How to Read Deep Learning Papers as a Software Engineer

September 25, 2024