10.11.2021

Articles on Artificial Intelligence & Machine Learning Recommended by Wojtek Ptak & Jon Bratseth

Another bunch of must-reads from IT experts. This time Wojtek Ptak & Jon Bratseth share their recommendations for AI & ML enthusiasts.

10.11.2021, added by Infoshare

We continue the series of must-reads about programming languages and software development recommended by IT experts. You can already check out Vitaly Friedman’s picks on Front-end & UX/UI, Cloud & DevOps recommendations by Tomasz Konieczny, and a selection of insightful articles on Java recommended by Tomasz Borek.

Now it’s time to dive deep into AI and machine learning with:

Wojtek Ptak – CTO of Talent Alpha
Jon Bratseth – a VP architect in the Big Data and AI group of Verizon Media

The two are the curators of the AI/ML/Data Science section of the Infoshare newsletter dedicated to developers, where they share articles worth reading and some useful tips. For those who haven't subscribed to our newsletter yet, we share some of Wojtek's and Jon's recommendations.

Wojtek Ptak

CTO of Talent Alpha, where he is responsible for leading the product team. He also drives the company's product strategy, data strategy, AI / ML development, cloud infrastructure, and the software architecture of the platform and apps behind Talent Alpha.

One of the Best Explanations of Reinforcement Learning and Deep Reinforcement Learning

Steve Brunton is an Associate Professor of Applied Mathematics at the University of Washington, known for his very well-delivered videos on topics related to mathematics, especially around Artificial Intelligence and control theory. If you're interested in reinforcement learning and deep reinforcement learning, Steve Brunton’s explanation will give you a good starting point and let you dig deeper with more confidence.

The Values Encoded in Machine Learning Research

Last month was full of events that put emphasis on ethics and AI. The most prominent one was the UN human rights chief's call for a moratorium on the sale and use of AI systems that pose a serious risk to human rights. To quote Michelle Bachelet, UN High Commissioner for Human Rights: "Artificial Intelligence can be a force for good. But AI technologies can have negative, even catastrophic, effects if they are used without sufficient regard to how they affect people's human rights." You can read more on the UN's website.

That made me dig a little deeper into the topic, and I found an excellent, thought-provoking paper (aren't such the best ones?) published only a few months ago: The Values Encoded in Machine Learning Research. Researchers from Stanford University, the University of California, and University College Dublin present a rigorous review of the field's values, based on 100 highly cited ML papers published at two premier ML conferences, ICML and NeurIPS. They examined the papers both quantitatively and qualitatively. Some striking highlights:
- 71% of publications "justify how it achieves technical goals but no mention of societal need."
- 98% of papers "do not mention negative potential."

I will be genuinely interested in your thoughts.

ETA Prediction with Graph Neural Networks in Google Maps

Graphs and Graph Neural Networks (GNNs) are definitely gaining momentum in the applied ML/AI fields. Why should you be interested in that? Graphs are probably the most natural way to describe many real-life concepts, such as maps, social networks, and connections between entities, including (in our case) knowledge taxonomies. Together with our team, we're exploring different ways of representing our own data model using graphs, and they produce excellent results. Graph Neural Networks are also often used in recommenders.
Google recently published an excellent paper on an actual use case for GNNs: ETA Prediction with Graph Neural Networks in Google Maps. It covers the approach, architecture, training, evaluation, and challenges with the final model. The paper applies GNNs to a real problem with great results, and it lays out the approach nicely (more clear publications like this one, please!).
You can also check out the paper overview with an explanation on YouTube.
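
To make the graph idea a bit more concrete, here is a minimal sketch of one round of neighbor aggregation (message passing), the basic building block of GNNs. The road segments, the speed values, and the `mix` weight are hypothetical illustration values. This is not the model from the Google Maps paper, which learns its aggregation from data; a fixed average stands in for that here.

```python
# Toy message passing on a tiny road graph.
# All names and numbers are made up for illustration.

def message_passing_step(features, neighbors, mix=0.5):
    """Blend each node's value with the mean of its neighbors' values."""
    updated = {}
    for node, value in features.items():
        adjacent = neighbors.get(node, [])
        if adjacent:
            neighbor_mean = sum(features[n] for n in adjacent) / len(adjacent)
            updated[node] = (1 - mix) * value + mix * neighbor_mean
        else:
            # Isolated nodes keep their own value.
            updated[node] = value
    return updated

# Hypothetical road segments with current average speeds (km/h).
speeds = {"A": 50.0, "B": 30.0, "C": 10.0}
# Which segments connect to which.
roads = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

smoothed = message_passing_step(speeds, roads)
```

After one step, each segment's value reflects its immediate surroundings; stacking several steps lets information travel further across the graph.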

Bonus: DeepMind's Reinforcement Learning Lecture Series 2021

DeepMind, a leader in the field, has published an excellent series together with University College London, and it includes a Coursera Specialisation. One that should not be missed. Enjoy!

Facebook’s "Learning From Videos to Understand the World" Project

With the rise of TikTok and its famous recommendation engine, Facebook was left behind and has responded with a new approach to serving video content. The new project is unprecedented in its holistic approach and size. It's definitely improving Reels on Facebook, but it also enables many new features, such as adaptation to a fast-moving world and recognition of nuances in videos on a new scale, along with some interesting new possibilities. Facebook also released its smart glasses this year, and with a system like this, well, it can perform so-called world-scraping: capturing granular data about the world by turning wearers of these products into live-feed walking cameras.

The Impact of Producing Large-Scale Models on the Environment

It might be a couple of months since its publication, but I believe this article on Green AI is truly important for getting an interesting perspective on some of the work being carried out with large-scale models. I came across this scientific article while listening to a very inspirational conversation between Oren Etzioni (CEO of the Allen Institute for AI, Professor Emeritus) and Kevin Scott (CTO of Microsoft) on Kevin's podcast Behind The Tech. To quote the opening paragraph: "An important paper has estimated the carbon footprint of several NLP models and argued this trend is both environmentally unfriendly and prohibitively expensive, raising barriers to participation in NLP research."

Codex and GitHub’s Copilot

GitHub’s Copilot sets an entirely new standard for developer tools and assistance, with full-blown AI code suggestions based on GPT-style transformers, which can generate whole methods based on method names, parameters, or developer comments. This is excellent news, especially for developers who are learning new technologies. With a growing need for tech talent, tools like this can only help and benefit most of us. On the other hand, it raises many valid questions about quality and security. For example, can insecure code be suggested to a developer?

The paper published on Codex by the OpenAI team covers various aspects of a dedicated, specialized large language model. Codex sits behind Copilot, so you can take a sneak peek into how the team designed, trained, and evaluated their work. The training dataset for Python alone was around 159 GB of code collected from 54 million public repositories, while the biggest model mentioned in the paper is Codex-12B. Yep, 12 billion parameters. The interesting point was that it required “a significant fraction of publicly available (…) code on GitHub, totaling hundreds of millions of lines of code”, while even the best developers never see a fraction of this amount of code. With that in mind, it shows the complexity of the problems we are solving and how much we can still do in these fields. “A strong student who completes an introductory computer science course is expected to be able to solve a larger fraction of problems than Codex-12B.”

NVIDIA’s Alias-Free GAN

My next pick is NVIDIA’s work on GANs, especially the super realistic model based on StyleGAN2. You’ve seen some of it at work; it resulted in very natural generations of photo-realistic faces. The new model, Alias-Free GAN, takes another step in this game: it enables photo-realistic transformations of the image, or between shots, with fantastic attention to detail. You can go to the website and see it at work, not only with human faces but also with animals or landscapes.

This model brings NVIDIA a step closer to a whole range of new applications, services, and functionalities. The progress we see with these models will soon enable cinematic photo-realistic image generation and transformation. Applications are endless, from cinematography, photography, and gaming to building video-conferencing tools which can use only head tracking to generate a realistic image of ourselves (that brings remote meetings to a whole new dimension, doesn’t it?). I also recommend the recently published paper on it and this episode of Two Minute Papers.

MLP-Mixer: An all-MLP Architecture for Vision

While I spend most of my time working with NLP solutions, computer vision is always close to my heart (and brain), and one of this month’s picks couldn’t be anything other than MLP-Mixer. This project, released by Google Brain teams from Berlin and Zurich, brings image processing back to Multi-Layer Perceptrons in an incredible all-in-one architecture while achieving results close to SOTA models at almost three times the training speed. Quoting the authors: “We believe these results open many questions. (…) It would be particularly interesting to see whether such a design works in NLP or other domains.” I recommend following the approach proposed in this paper in the future. Excited!

Microsoft Introduces Its First Product Features Powered by GPT-3

There are several reasons I recommend spending some time reviewing the impact of products with features powered by GPT-3 as introduced by Microsoft. From my perspective, to begin with, the tech world is experiencing an enormous talent gap. Estimates suggest that while the number of IT specialists grows at about 4.5% per year (Statista), there will be about 145M new jobs on the market by the end of 2025 (Microsoft). This presents significant challenges for the whole industry worldwide, only accelerated by COVID. We need to think of new ways to speed up experimenting with, building, and delivering solutions. This product is a harbinger of changes to come, enabling both more and less experienced folks to experiment more directly, which is great news.

On the other hand, it was only a question of when AI would help us, IT specialists, create new projects. And, I must say, I wasn’t expecting something of this scale so soon. Fantastic work from Microsoft in integrating GPT-3 and Power Apps to create low-code solutions using conversational language.

Jon Bratseth

Jon Bratseth is a VP architect in the Big Data and AI group of Verizon Media, and the architect and one of the main contributors to Vespa.ai, the open big data serving engine. Jon has 20 years of experience as an architect and programmer on large distributed systems. He’s also a frequent public speaker.

Learning to Think it Through

Modern approaches to AI use a fixed amount of computation to come up with an answer to any problem, even though we know this isn't right: harder problems deserve more thinking. DeepMind takes a stab at this problem with PonderNet, a system that allocates compute to a problem based on its complexity, and shows that it can achieve better results at a lower overall cost.
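
The core bookkeeping behind this kind of adaptive computation can be sketched in a few lines. This is a minimal sketch, assuming the formulation where the network emits, at every step, a conditional probability of halting at that step given that it has not halted yet. The function names and the numeric values are hypothetical, and the learned network that would produce the probabilities is omitted entirely.

```python
# Sketch of the halting bookkeeping in PonderNet-style adaptive computation.
# The per-step probabilities below are made up; in a real system they
# would be produced by the trained network.

def halting_distribution(step_halt_probs):
    """Turn per-step conditional halt probabilities into the unconditional
    probability of halting at each step, plus the leftover probability of
    not having halted within the given steps."""
    probs = []
    still_running = 1.0
    for halt_prob in step_halt_probs:
        probs.append(still_running * halt_prob)
        still_running *= 1.0 - halt_prob
    return probs, still_running

def expected_steps(step_halt_probs):
    """Expected number of computation steps (steps are counted from 1)."""
    probs, _ = halting_distribution(step_halt_probs)
    return sum(step * p for step, p in enumerate(probs, start=1))

# A hypothetical "easy" input halts early, a "hard" one keeps computing.
easy = expected_steps([0.5, 1.0])
hard = expected_steps([0.0, 0.0, 1.0])
```

The expected step count is what makes the cost measurable: inputs whose halting probabilities rise early consume less compute on average than those that keep pondering.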

Is Your Data Broken?

A recent paper from Google Research discusses the pervasiveness of the phenomenon where 'everybody wants to do machine learning and nobody wants to clean the data' and how it leads to bad results, no matter the sophistication of your machine learning.

Make Every Feature Binary

Microsoft Research published details on the Bing relevance model, and they are taking an unusual approach.
They encode detailed information, such as a relationship between a specific word in a query and another word in a document, as 135 billion binary parameters and train on that. They found this to scale better to more training data than other models. A proprietary large-scale training framework is used to make this feasible.
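
The idea of turning word relationships into binary features can be illustrated with a small sketch. This is an assumption-laden toy, not Microsoft's actual feature definitions or training setup: each (query word, document word) pair is treated as one binary feature that is either present or absent, and a linear model sums the weights of the features that are present. The weights here are invented by hand, whereas a real system would learn them from data.

```python
from itertools import product

def cross_features(query, document):
    """Return the set of active binary features: one per
    (query word, document word) pair that occurs."""
    return set(product(query.lower().split(), document.lower().split()))

def score(features, weights):
    """Linear model over binary features: add up the weights of the
    features that are active; unknown features contribute nothing."""
    return sum(weights.get(feature, 0.0) for feature in features)

# Hypothetical hand-picked weights, for illustration only.
weights = {
    ("cheap", "discount"): 1.5,
    ("flights", "airline"): 2.0,
}

active = cross_features("cheap flights", "Airline discount deals")
relevance = score(active, weights)
```

Because every feature is either on or off, the model only touches the handful of weights relevant to a given query and document, which is one reason very large parameter counts can stay practical.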

Facebook Reveals Details on Their 12 Trillion Parameter Recommendation Model

Recommendation models that drive users' behavior online have a much larger impact on society than the more well-known language and vision models such as BERT, GPT, and DALL-E. As you would expect from that, it turns out they are also much larger. Read the full paper to see how Facebook is able to scale to learn such enormous models.

Huggingface Introduces Accelerate to Simplify Fast Training of PyTorch Models

The standard PyTorch API is well known to most data scientists, but training larger models that require multiple nodes or GPUs means learning additional abstractions and APIs. Now Huggingface is announcing Accelerate, an open-source project that aims to make distributed training simple by staying with the standard PyTorch abstractions and supporting many different training environments without boilerplate code.

OpenAI Announces Over 300 Applications Are Now Using Their Commercial GPT-3 Offering

OpenAI's commercial GPT-3 service launched 9 months ago and, for a fee per request, lets applications generate plausible text given a prompt. The 300 applications range from copy generation and games to question answering for chatbots and sentiment analysis with very high fidelity, and much more. Many of the applications come from startups that are not making money yet, and it remains to be seen how many of them ever will, but this is surely a quickly developing field with great potential.

The EleutherAI Project Releases a 6B GPT-3-Style Model as Open Source

EleutherAI, a group of unaffiliated researchers, has released a 6 billion parameter GPT-3-like model as open source. For comparison, GPT-3 comes in variants with about 2.7, 6, 13, and 175 billion parameters. With efforts like this succeeding, it seems we can put the worry that only the largest companies will afford to compete in the AI space behind us.

The Language Model AI Revolution is Coming to Search Applications

Over the last two years, large transformer models have been applied by researchers to handily beat traditional methods in relevance benchmarks. However, these methods have been too costly to be usable in real applications. This has finally changed with this open-source work, which has managed to apply these methods while keeping cost and latency low.

Advances in Computer Vision: Better Performance at Half the Cost

In a new paper, Google presents a vision model which scores a record 90.35% on ImageNet. More interesting than the better performance is that this model is sparse, using transformers instead of convolutions, and therefore much cheaper to evaluate. Inference cost is what stops many businesses from applying computer vision, and this may soon change.

Hungry For More Content On AI & ML?

Subscribe to Infoshare Monthly Newsletter for developers to stay up-to-date with Wojtek’s, Jon's and other experts’ recommendations on Java & backend, AI & ML, Cloud & DevOps, and IT leadership.
