77 Stories To Learn About Distributed Systems

24 Apr 2023

Let's learn about Distributed Systems via these 77 free stories. They are ordered by the most reading time generated on HackerNoon. Visit the /Learn Repo to find the most read stories about any technology.

1. How Shift-Right Testing Can Build Product Resiliency

Shift-right testing improves product resiliency by uncovering issues that surface under heavy user traffic and are difficult to simulate in test environments.

2. Best Practices For Apache Kafka Configuration

Having worked with Kafka for more than two years now, I've seen the interaction between two configs confused almost everywhere.

3. A Look at the Top Questions for a System Design Interview at Facebook

Facebook System Design Interview; Design Facebook NewsFeed; Design Status Search; Design Live Commenting; Design Facebook Messenger / WhatsApp; Design Instagram

4. Working With Transient Errors

Each remote service that we call is eventually going to fail. No matter how reliable they are, it is inevitable.

5. How to Use Cloud-init to Self-Register k3OS Clusters to Rancher

The lightweight Kubernetes OS that is known as k3OS has quickly been gaining popularity in the cloud-native community as a compact and edge-focused Linux distribution that cuts the fat away from the traditional K8s distro. While k3OS is picking up steam, it is still on the bleeding edge and there is still a bit of a shortage of learning material out there for it.

6. Replicate PostgreSQL Databases Using async Python and RabbitMQ for High Availability

PostgreSQL replication using Python and RabbitMQ gives your database server high availability by easily making replicas of your master server.

7. Federated Learning: A Decentralized Form of Machine Learning

Major companies using AI and machine learning now use federated learning – a form of machine learning that trains algorithms on a distributed set of devices.

8. EdTech: How Dutch Secondary Schools Are Transforming With Tech

It's been 2 years since I joined a Dutch EdTech company as Lead Dev and in this article I'll explain how we are transforming communication for Dutch schools.

9. An Intro to DisCO by the DisCO CAT: Color me Hopeful

DisCO is a cooperative, feminist economic, commons-oriented and P2P way of working and an alternative to DAOs.

10. Hackable Blockchains Simulations

Distributed consensus simulation visualization, DYOR

11. Promises, Microservices, and Intent

Last year, after a bit of wrangling and lots of editing by the fantastic Jenn Webb, O’Reilly published a discussion Mark Burgess and I had on one of his trips through the Valley as a podcast.

12. How Optimizing Outbound Logistics Saves Your Business Time and Money

Outbound logistics play a critical role in a company's overall supply chain management and can significantly impact its bottom line.

13. Common Design Patterns for Building Resilient Systems (Retries & Circuit Breakers)

We talk about two design patterns that highlight best practices for building resilient microservices architectures at scale.

14. Distributed Computing: Illusion of Single System

What do streaming movies on Netflix, searching for information on Google, and buying clothes on Amazon have in common? In each case, you interacted with data services built on distributed systems. You interact with the largest distributed system daily: the Internet.

15. How To Deal with Global Crisis: A Better “Truth Machine”

One of the most terrifying parts of the current crisis is uncertainty. Uncertainty is one of the most terrifying things people can experience in general. Absolutely everyone I have spoken to is absolutely convinced that a lot of the information available is either biased, doctored or flat-out false.

16. Microservices? Why Not!

The cost of microservices from a developer's perspective.

17. Using Web3 to Detangle the World’s Supply Chain

The global supply chain is in a gridlock. Let's fix that.

18. Super Duper SQL Tips for Software Engineers

In this post, we will talk about the features of working with SQL and how you can improve your database queries and speed up your app.

19. Fixing The ClickHouse Node Failure On Distributed Systems - A How-To Guide

Part One: ClickHouse Failures, by Marcel Birkner

20. When to Use the Fluence Protocol

How the Fluence network enables creative apps, using a surprise party planning app as an example.

21. How to Avoid Inconsistency Across Microservices

In a microservice architecture, you can get dependencies that impose restrictions on the services used

22. Failure Modes: Why You Need To Know Them

What are bimodal failure modes and how to avoid them

23. Is Kafka the Key? The Evolution of Highlight's Ingest

Building a distributed message processing queue using Apache Kafka requires some thought. We walk through how we process thousands of large messages per second.

24. What Are Conflict-free Replicated Data Types (CRDTs)?

In a world where most of the apps that we use on the internet are collaborative in nature, conflicts in data are common. Is there a way to avoid them?

25. The Client Server Model: Breaking Free with IPFS

This is a condensed version of this post on the Client-Server Model. We use a client, such as a web-browser or chat app, and communicate with a single entity.

26. The Realtime API Family [A Deep Dive]

Looking to 2020 and beyond, the proportion of data produced and consumed in realtime is growing exponentially. IDC predict that by 2025 1/3 of all data produced globally will be realtime.

27. How to Use the Whiteboard in System Design Interviews 

You'll likely be asked some system design questions when interviewing at many tech companies today. Here's how to use the whiteboard to answer them effectively.

28. My System Design Interview Checklist in 8 Simple Steps

That dreaded system design interview. I remember the first system design question I was asked. “Design WhatsApp”, he said. I didn’t know where to start! I was a fresher. Data structures and algorithms were the only things I knew. I am sure you can guess how that interview went. Then after enough research, I made myself a checklist of components, of sorts, to navigate me through my next system design interviews. And I sh*t you not, it works!

29. Interplanetary Versioned File System

IPVFS: A lightweight version control system for files on the Interplanetary File System.

30. How to Launch Your Own Blockchain: Mainnet Launch [Part VI]

I II III IV V

31. Kafka Storage Design - Making File Systems Cool Again!

What makes Kafka so Fast? A Deep Dive into Kafka Storage Internals.

32. Choosing Between Enterprise Messaging and Event Streaming For Your Architecture

Comparing Enterprise messaging and event streaming across different dimensions to see how they excel at solving different but related messaging problems

33. How to Launch Your Own Blockchain: Game of Validators [Part V]

In some blockchains, validators are pre-defined; in others, independent teams and individuals own the nodes. A game-based approach is an excellent way to choose validators wisely.

34. Producers Guarantees for Event-driven Development

Instead of consumers' delivery guarantees in message queues, in this article, we're going to talk about producers' guarantees in distributed systems.

35. Build a Self-Hosted Online Second Brain Like Evernote

How to use Platypush and other open-source tools to build a notebook synchronized across multiple devices

36. Microservice Observability Patterns [Part 2]

In my previous article, I talked about the importance of logs and the differences between structured and unstructured logging. Logs are easy to integrate into your application and provide the ability to represent any type of data in the form of strings.

37. How to Launch Your Own Blockchain: Mainnet Support

The main network is running, transactions are being sent, the wallet is working. What's next? In this article, we will consider how to maintain a network and solve its problems.

38. 5 Books You Can Read to Boost Your Computer Science Knowledge

Make use of your downtime and read something good!

39. Distributed Governance and Anonymity: A Bad Idea

One of the big debates in the Genesis DAO started by DAOstack was the question of anonymity. Should people be able to make proposals and ask for budgets without providing a real identity?

40. Why Use Kubernetes for Distributed Inferences on Large AI/ML Datasets

This blog provides you with some strong rationale to use Kubernetes on large AI/ML datasets on which distributed inferences are performed. Loop in for more.

41. Exploring the CAP Theorem: The Ultimate Battle of Trade-Offs in Distributed Systems

Consistency, availability, and partition tolerance are the three musketeers of distributed systems. They ensure that your system operates correctly.

42. Dealing With Replication, High-Performance Queries And Other Data Platforms Challenges

Many products solve for global issues and load balancing but unless a platform is built from the ground up with the necessary backbones, it becomes a nightmare to manage.

43. An Introduction to Blockchain + NoSQL Databases

Both NoSQL databases and modern Blockchain ledgers benefit from a set of common principles. When they are both implemented for an application, a lot can be accomplished as the platforms can complement each other.

44. Efficient Model Training in the Cloud with Kubernetes, TensorFlow, and Alluxio Open Source

This article presents the collaboration of Alibaba, Alluxio, and Nanjing University in tackling the problem of Deep Learning model training in the cloud. Various performance bottlenecks are analyzed with detailed optimizations of each component in the architecture. This content was previously published on Alluxio's Engineering Blog, featuring Alibaba Cloud Container Service Team's case study (White Paper here). Our goal was to reduce the cost and complexity of data access for Deep Learning training in a hybrid environment, which resulted in over 40% reduction in training time and cost.

45. Accelerating Analytics by 200% with Impala, Alluxio, and HDFS at Tencent

This article describes how engineers in the Data Service Center (DSC) at Tencent PCG (Platform and Content Business Group) leverage Alluxio to optimize analytics performance and minimize operating costs in building Tencent Beacon Growing, a real-time data analytics platform.

46. Great Options for Generating Passive Income With Ethereum 2.0

In this article, we explore custodial, semi-custodial, and non-custodial staking services and review the industry's leading non-custodial protocols for ETH 2.0

47. Distributed Governance and Anonymity: A Bad Idea

One of the big debates in the Genesis DAO started by DAOstack was the question of anonymity. Should people be able to make proposals and ask for budgets without providing a real identity?

48. Introduction To Distributed Tracing Pattern

A distributed architecture brings in several challenges when it comes to operability and monitoring. Here, one may be dealing with tens if not hundreds of microservices, each of which may or may not have been built by the same team.

49. Alluxio Accelerates Deep Learning in Hybrid Cloud using Intel’s Analytics Zoo powered by oneAPI

This article describes how Alluxio can accelerate the training of deep learning models in a hybrid cloud environment when using Intel’s Analytics Zoo open source platform, powered by oneAPI. Details on the new architecture and workflow, as well as Alluxio’s performance benefits and benchmark results, will be discussed. The original article can be found on Alluxio's Engineering Blog.

50. Revising The Basics of Blockchain Part 1: Introduction

A very Beginner-friendly Guide to Understanding the Blockchain (Part 1: Introduction to Blockchain Technology)

51. Blockchain as a Distributed File System: How Would It Work?

In this article, we propose a blockchain network that acts as a centralized append-only distributed file system (DFS) such as Hadoop Distributed File System (HDFS) or Google File System (GFS). The potential advantages of blockchain as a distributed file system (BaaDFS) include:

52. AvionDB Introduction: A MongoDB-like Distributed Database

In the past few months we have been getting this question a lot:

53. Serving Structured Data in Alluxio

This article introduces Structured Data Management (Developer Preview) available in the latest Alluxio 2.1.0 release, a new effort to provide further benefits to SQL and structured data workloads using Alluxio. The original concept was discussed on Alluxio’s engineering blog. This article is part one of the two articles on the Structured Data Management feature my team worked on.

54. AvionDB: A MongoDB-like Distributed Database

In the past few months we have been getting this question a lot:

55. Step-By-Step Tutorial To Deploy A Distributed Node.js App At The Edge

In this tutorial, we are going to demonstrate how to deploy a distributed Node.js app at the Edge using Section's Edge Compute Platform.

56. Serving Structured Data in Alluxio: Example

In the previous article, I described the concept and design of the Structured Data Service in the Alluxio 2.1.0 release. This article will go through an example to demonstrate how it helps SQL and structured data workloads.

57. 10 Steps To Digital Transformation While Simultaneously Cutting Costs

Companies Must Transform Or Else

58. A Quick Primer on Everything You Need to Know About Blockchain

Blockchain is a term utilized to represent distributed ledger technology.

59. How to Choose the Right Database for your Requirements

Imagine — You’re in a system design interview and need to pick a database to store, let’s say, order-related data in an e-commerce system. Your data is structured and needs to be consistent, but your query pattern doesn’t match with a standard relational DB’s. You need your transactions to be isolated, and atomic and all things ACID… But OMG it needs to scale infinitely like Cassandra!! So how would you decide what storage solution to choose? Well, let’s see!

60. Serving Structured Data in Alluxio: Example

In the previous article, I described the concept and design of the Structured Data Service in the Alluxio 2.1.0 release. This article will go through an example to demonstrate how it helps SQL and structured data workloads.

61. Did you know you could write scripts with webservices? You do now.

There's a big hole in reusability on the web. An entertaining statistic - not the most accurate but still fascinating - was generated by Simon Wardley from a Twitter poll. He calculated that basic user registration had been written over a million times. The average developer had written user registration about 5 times. I'm sure you've built it a few times yourself.

62. How to Route Traffic Between Microservices During Development

Route traffic between microservices during development with this one simple trick that will save you setup time and, well, headache.

63. Rethinking Programming: The Network in the Language

With the emergence of microservices architecture, applications are developed by using a large number of smaller programs. These programs are built individually and deployed into a platform where they can scale independently. These programs communicate with each other over the network through simple Application Programming Interfaces (APIs). With the disaggregated and network distributed nature of these applications, developers have to deal with the Fallacies of Distributed Computing as part of their application logic.

64. Embrace the Chaos, Randomness and Uncertainty on Your Path to Engineer Better Software

Chaos engineering is the practice of deliberately injecting an error into a system, in order to observe, in vivo, the consequences.

65. #NoBrainers: You Need A High Performing Low Latency Distributed Database

Certain industries greatly benefit from high-performing, low-latency, geo-distributed technologies.

66. Data Location Awareness: The Benefits of Implementing Tiered Locality

Tiered Locality is a feature led by my colleague Andrew Audibert at Alluxio. This article dives into the details of how tiered locality helps provide optimized performance and lower costs. The original article was published on Alluxio’s engineering blog.

67. Decentralized Databases Reduce Data Latency With Geographically Distributed Data Centers

Latency is caused by offloading processing from an app to an external server. But what if there was a solution to the monolithic common single-cloud geography?

68. Kafka Basics and Core Concepts: Explained

In this article we will cover the core concepts of Kafka and also will touch upon a few of the advanced topics.

69. Delta Compression: Diff Algorithms And Delta File Formats [Practical Guide]

A diff algorithm outputs the set of differences between two inputs. These algorithms are the basis of a number of commonly used developer tools. Yet understanding the inner workings of diff algorithms is rarely necessary to use said tools.

70. Working Towards a Sustainable Ecosystem Beyond Tokenization

Many industries are on the brink of the next technological revolution in record keeping. Ten years after Bitcoin made its splash, we’re seeing many inspired by some of the benefits promised by the technology outside of the money use case:

71. How To Optimize Large S3 API Costs using Alluxio

This is a guest blog contributed by datasapiens’ Juraj Pohanka, Koen Michiels and Sam Gilbert. This article describes how engineers at datasapiens brought down S3 API costs by 200x by implementing Alluxio as a data orchestration layer between S3 and Presto.

72. HarperDB is More Than Just a Database: Here's Why

HarperDB is more than just a database, and for certain users or projects, HarperDB is not serving as a database at all. How can this be possible?

73. Enhancing Bitcoin's Transaction Privacy With Bloom Filters

Bloom filters are a data structure developed by Burton Howard Bloom in 1970. You can see them as a cousin of hash tables. They also allow for efficient insert and lookup operations while occupying very little space.

74. Building Microservices With Nameko

What is Nameko? Nameko is a framework for building lightweight, highly scalable and fault-tolerant services in Python.

75. Connecting the Dots: FLP, BFT & Consensus Algorithms

When learning about blockchain consensus algorithms and distributed systems in general, you will inevitably come across terms like FLP impossibility and Byzantine fault tolerance. While there is plenty of literature on these subjects, it often suffers from a narrow focus, failing to explain the connections and relationships between them. Furthermore, much of the existing literature gives either too much or not enough technical detail; I found this to be especially true when learning about consensus algorithms like proof of stake.

76. Challenges and Opportunities of Serverless in 2021

Going serverless has many benefits, but it's not without its issues. Learn about the most common serverless challenges & how to overcome them.

77. Block-Less Blockchains: Understanding Directed Acyclic Graphs or DAGs

Blockchain 3.0 will be upon us very soon. With Ethereum and so many other blockchain networks fighting for this, can directed acyclic graphs be the future?

Thank you for checking out the 77 most read stories about Distributed Systems on HackerNoon.
