Discover more from Mostly Harmless Ideas
Introduction to Computer Science
A high-level overview of the whole field of Computer Science.
Computer Science is a relatively new field that emerged in the early 20th century from a massive effort to formalize Mathematics. However, it wasn't until the mid-60s and early 70s that it gained recognition as a distinct and unique scientific field. This was achieved by establishing the theoretical grounds of computability and complexity theory, which proved there were many interesting, novel, and challenging scientific questions involving computation.
Computer Science is an extensive field that encompasses a wide range of knowledge and skills. It goes from the most abstract concepts related to the nature of computable problems to the most practical considerations for developing useful software. To solve problems in this field, one must navigate through various levels of abstraction, making it challenging to provide a concise summary that the average reader can easily understand.
In this series, I will introduce you to the vast and exciting world of Computer Science. We will begin by examining the numerous interconnected subfields of CS at a macro level. We will delve into specific fields, concepts, and even practical techniques and algorithms in future posts. I hope you enjoy the ride, and maybe, if I’m persuasive enough, you will decide that you want to master Computer Science.
A Map of Computer Science
Computer Science encompasses a vast range of subjects, incorporating multiple areas of math and engineering. It also introduces many original concepts and ideas. Thus it is impossible to divide this field objectively into subfields or separate areas with precise borders. I will attempt to categorize it into somewhat cohesive disciplines, but keep in mind there is considerable interconnectedness between them.
In the following sections, I will lay out what I believe are the most fundamental concepts and problems in Computer Science. Whenever possible, I will provide a link to further reading, most commonly to a relevant Wikipedia article, which in these topics is a highly accurate and accessible resource. I will also provide links to the other posts in this series in bold format.
Let's start by looking at the foundations. The fundamental question of Computer Science is something akin to "What kinds of problems can be solved with algorithms." It was first asked formally by David Hilbert in 1901 and first answered by Alan Turing in 1936. This central question of computability theory establishes the foundational theory for what can be done with any computer.
The core concept in computability is the Turing machine, an abstract computer model that we define to study the properties of computation and computable problems irrespective of any concrete machine our current technology supports. Turing machines let us answer which kinds of problems are, in principle, undecidable —which means that no algorithm can ever be devised to solve them completely. Surprisingly, there are many of those, not all esoteric; there are very practical problems in Computer Science that we know are mathematically impossible to solve. Turing machines also give a precise definition of algorithm.
Once we lay out the limits of computation, we can ask which computable problems are easier or harder. Complexity theory deals with the complexity of different problems. It asks how efficiently, most commonly in terms of computing time and memory, we can solve any problem. For some problems, we can even prove that we have found the most efficient algorithm anyone could ever develop. The most important problem in complexity theory, and probably in the whole of Computer Science, is the famous question of whether P=NP, which is ultimately a question about the nature of really hard problems.
The final ingredient in the foundational theory of Computer Science is the relationship between formal languages and automata. Formal language theory allows us to formalize the notion of a language, whether it is a human language like English or Spanish, a programming language like Python or C#, and any technical, abstract, or mathematical notations used in many fields.
The basic concept in formal language theory is formal grammar, a formal definition of what can be said with a specific language. In contrast, automata theory deals with abstract machines —like Turing machines— of different levels of complexity that can recognize specific families of formal languages. The study of the relationship between language generation and recognition led to the invention of compilers, which are computer programs that understand specific programming languages and translate them to machine code that can be run on concrete hardware.
Computer Science emerged from Mathematics, and therefore, it adopted the mathematical practice of establishing axioms and demonstrating theorems. Besides, when tackling real-world computational challenges, we frequently apply mathematical techniques used in engineering and other fields of science. Consequently, a substantial amount of mathematical knowledge underpins Computer Science, albeit with a computational perspective.
Logic forms the bedrock of Computer Science as it gave birth to its fundamental principles. By enabling us to formalize reasoning algorithmically, Logic allows us to create programs that can manipulate logical formulas and prove theorems and also lets us formally prove that an algorithm works as intended. Discrete Math is a natural extension of Logic, encompassing a vast mathematical field that studies discrete objects such as collections of natural numbers. It formalizes numerous basic computational concepts like graphs and trees. Additionally, Discrete Math lies at the core of modern cryptography, which relies heavily on number theory.
Algebra and calculus are fundamental math subfields used in most engineering and scientific disciplines, and even more so in Computer Science. Algebra deals with mathematical structures such as vectors, matrices, vector spaces, and operations between them. Many abstract algebraic operations can be interpreted as some form of data manipulation. Thus they appear at the core of many CS applications, from computer graphics to compression to search engines to machine learning. Algebra also strongly connects with computational geometry, the branch of geometry that studies computational approaches to 2D and 3D modeling, computer-aided design (CAD), and many related applications.
Calculus deals with the relationship between changing quantities. It enables us to determine how to adjust the input to achieve a desired output, even when their relationship is fairly complex. If you go deeper into the formal properties of these relations, you will find the field of mathematical analysis. Two key concepts in calculus and analysis are differentiation and integration, and they have numerous applications in Computer Science, particularly in continuous optimization, which also underpins many modern machine learning methods. An interesting middle ground between algebra and calculus is the field of differential equations, which helps to model complex phenomena, from weather to engines to epidemics, where the rates of change among different quantities depend on each other. Mathematical analysis also provides the foundations for the fields of probability and statistics.
Probability is the branch of mathematics that studies reasoning under uncertainty. In a sense, it’s an extension of Logic when we consider that statements can not only be true or false, but they have some degree of likelihood. A related field is Statistics, which involves understanding and extracting insights from —often uncertain— data. Both fields are extremely relevant to data science and intersect with Computer Science in many tasks, from analyzing real-world data to designing programs that can reason and make decisions under uncertainty, such as autonomous vehicles or virtual agents.
However, continuous math (including algebra, calculus, probabilities, and statistics) assumes real numbers with infinite precision, but in physical computers, all we can hope for is finite approximations. Thus, every mathematical operation performed in a computer can involve some approximation errors. Dealing with this problem robustly and efficiently is surprisingly hard. It is the purpose of numerical analysis, an essential tool to avoid catastrophic errors from misunderstood approximations. It also provides efficient algorithms for many abstract operations from algebra and calculus.
Beyond the theoretical foundations, solving a concrete Computer Science problem will often require using one or more algorithms. Hence, studying which specific algorithm can solve which concrete problem is a big part of Computer Science. Some of the most basic and commonly used algorithms you’ll encounter are devoted to solving problems on collections of items, such as lists of numbers or sequences of letters. Two basic data structures used throughout Computer Science are arrays and strings.
Strings are sequences of symbols —called characters— most commonly used to represent text. String matching, i.e., the problem of finding if a given pattern appears in a text, is one the most studied problems in CS with applications everywhere from compiler design to DNA analysis to chatbots to search engines.
An array is a collection of items of an arbitrary type —e.g., numbers, words, images, or custom user data records. The most basic operations in arrays are searching and sorting, which give rise to some of the most clever algorithms ever designed. On top of arrays, we find many data structures that focus on efficiently performing some particular operations on specific types of objects. These include lists, queues, stacks, heaps, and the two fundamental ways to represent data in Computer Science: trees and graphs.
A tree is a hierarchical collection of items in which every item has a parent and potentially many children. It is used anywhere we need to represent intrinsically hierarchical data, such as organizations of people. However, trees also appear in many problems where the hierarchy is not obvious, such as evaluating mathematical expressions or searching large mutable collections of objects.
Graphs are the most general data structure to represent abstract information computationally. A graph is a collection of nodes connected by edges. It can represent anything network-like, from geospatial data to social networks to abstract things like the dependencies between subtasks in a given problem. Graph algorithms are so vast that you can find whole subfields dedicated to just a subset of them. Still, the most basic one is path-finding, i.e., finding a path —often optimal in some sense— between one specific source node and one or more destinations.
However, just memorizing a huge toolbox of algorithms is not enough. Sometimes you’ll find problems for which no one has already designed an algorithm. In these cases, you’ll need algorithm design strategies, which are higher-level ideas and patterns useful for creating new algorithms.
The most powerful such strategy is recursion, so much so that “you can do anything with recursion” is a popular meme in Computer Science. Recursion is all about solving a problem by decomposing it into subproblems that are easier to solve. When taken to its extreme, this idea allows us to formalize the whole theory of computability, which means that any algorithm in Computer Science is, at its core, recursive. There are many recursive patterns; some of the most commonly used are tail recursion, divide and conquer, and backtracking.
Other powerful design strategies underpin many of the most successful algorithms in Computer Science. Greedy algorithms, for example, are the ones that make the best short-term choice. While this is often suboptimal, there are surprisingly many cases where a carefully chosen local optimum leads to a global optimum. Dynamic programming is another such strategy. Under the right circumstances, it allows transforming a slow recursive algorithm into a very fast non-recursive version while retaining all the correctness guarantees.
Mastering algorithms is not enough, though. Ultimately, we need a physical device that can run those algorithms and give us a result at a reasonable cost. Real computers are the physical embodiment of Turing machines, but the devil is in the details. Building a real computer is far more involved than just wiring some cables. Even assuming, as we do in CS, that electronic components always work as expected, the design of resilient computational systems is a whole discipline.
At the lowest level, we have transistors, microscopic circuitry that performs a very simple operation: an electronic current breaker. We can design logical circuits —combinations of electronic components that perform whatever logical operation we desire— with just a few transistors. We can design circuits to add, multiply, or store values to build a Central Processing Unit (CPU) connected to a Random Access Memory (RAM) bank. This is the basic computer, which can be programmed with very simple languages called Assembly Languages. It is at this level that the frontier between hardware and software melds.
Modern computers have increasingly more complex components, including external memory drives and peripherals like displays and keyboards. Keeping all of those components in harmony is the job of the Operating System. OSes like Windows and Linux do much more, though. They provide an abstraction layer over common operations that many programs need, including interfacing with hardware peripherals —e.g., playing sounds or capturing video— or writing and reading from files. Two key abstractions are processes, which allow each program to behave as if it were the only one running, and threads, which enable concurrency even if you have a single physical CPU.
But if one computer is incredible, the real magic begins once you have two or more working together. Enter the world of computer networking. The first problem we must solve is simply getting two computers on both sides of the planet, talking reliably to each other over a noisy, error-prone communication channel. The TCP/IP protocol is a collection of algorithms that guarantee communication even when intermediary steps fail constantly. It's the software backbone of the Internet, possibly the single most important technological advance after the first industrial revolution.
Distributed systems allow multiple computers to communicate and collaborate, creating larger organizational levels of computational systems. The simplest setup is the client-server architecture, where an application is split between the user and the service provider. As these systems scale up, they require coordination among several computers to maintain reliability even when some individual components fail. To achieve this, consensus algorithms are used for data distribution, monitoring system health, and error correction tasks.
Some of today's largest software platforms, like the Google Search Engine and Amazon Web Services, rely on distributed systems as their foundation. These infrastructures are often called cloud computing —vast networks of computers working together as one unified computing platform.
A crucial component in computational systems is security, especially when valuable information is stored on computers connected to the internet. Ensuring reliable and secure communication between computers is a major challenge in computer science, addressed with the help of cryptography.
Cryptography has its roots in breaking machine codes during World War II, such as the Enigma code deciphered by Alan Turing. Modern cryptography relies on symmetric and asymmetric or public-key methods, which enable two actors to establish a secure channel over an insecure communication medium without ever meeting each other. This principle forms the backbone of most online interactions, from reading your email to using your bank account to chatting with another person so that no one can access your data, often not even the service provider.
Building software that works is extremely hard. It's not enough to master all the algorithms, data structures, theorems, and protocols. Software products have unique qualities among all other engineering products, especially regarding flexibility and adaptability requirements. Any software that lives long enough will have to be changed in ways that weren't predictable when the software was first conceived. This poses unique challenges to the engineering process of making software, the domain of software engineering.
Software construction starts with a programming language, of which there are plenty of variants. Some programming languages are designed for specific domains, while others are called general purpose and designed for any programming task. A fundamental distinction is between imperative and declarative languages. The most influential paradigms are functional (often declarative) and object-oriented (most commonly imperative) programming models.
Programming languages are tools, so their effectiveness in solving a concrete problem relies heavily on applying best practices. Software engineering also deals with finding principles and practices to use better the available tools, including technical tools and human resources. The most important set of software engineering principles is the SOLID principles. Other principles, such as DRY and YAGNI, emphasize specific, pragmatic mindsets about software development. Additionally, we have dozens of design patterns: concrete reusable solutions to common software development problems.
Beyond the actual code, the most crucial resource most applications must manage is data. As the necessity to organize, query, and process larger and larger amounts of data increases, we turn from classic in-memory data structures to more advanced long-term storage solutions: databases. The relational database is the most pervasive model for representing and storing application data, especially structured data. However, several alternative non-relational paradigms are gaining relevance as less structured domains (such as human language and images) become increasingly important.
In the data-centric software world, information systems are some of the most complex applications we can find. They provide access to vast amounts of information, which can be from a concrete domain —such as medical or commercial information— or general purpose like search engines. These systems combine complex algorithms and data structures for efficient search with massively distributed hardware architectures to serve millions of users in real-time.
As we've seen, software ultimately runs on someone's computer, and different computational systems pose different challenges. From a software engineering perspective, we can understand those challenges better if we think in terms of platforms. The three major software platforms are the desktop (apps installed in your computer), the mobile (apps installed in your smartphone), and the web (apps running on your browser).
The final component in software engineering concerns the interaction between the user and the application. Human-computer interaction (HCI) studies the design of effective user interfaces, both physical and digital. Two major concerns are designing an appropriate user experience, and ensuring accessibility for all users, fundamentally those with special needs such as limited eyesight, hearing, or movement.
Artificial Intelligence (AI) is a broad and loosely defined umbrella term encompassing different disciplines and approaches to designing and building computational systems that exhibit some form of intelligence. Examples of machines exhibiting intelligent behavior range from automatically proving new theorems to playing expert-level chess to self-driving cars to modern chatbots. Accurately defining what we mean by machine intelligence is extremely difficult and out of the scope of this post, but ultimately it revolves around the capacity to solve hard problems with as little explicit guidance as possible. We can cluster the different approaches in AI into three broad areas: knowledge, search, and learning.
Knowledge representation and reasoning studies how to store and manipulate domain knowledge to solve reasoning and inference tasks, such as medical diagnosis. We can draw from Logic and formal languages to represent knowledge in a computationally convenient form. Ontologies are computational representations of the concepts, relations, and inference rules in a concrete domain that can be used to infer new facts or discover inconsistencies automatically via logical reasoning. Knowledge discovery encompasses the tasks for automatically creating these representations, for example, by analyzing large amounts of text and extracting the main entities and relations mentioned. Ontologies are a special case of semantic networks, graph-like representations of knowledge, often with informal or semi-formal semantics.
Search deals with finding solutions to complex problems, such as the best move in a chess game or the optimal distribution of delivery routes. The hardest search problems often appear in combinatorial optimization, where the space of possible solutions is exponential and thus unfeasible to explore completely. The most basic general search procedures —Depth-First Search (DFS) and Breadth-First Search (BFS)— are exact, exhaustive, and thus often impractical. Once you introduce some domain knowledge, you can apply heuristic search methods, such as A* and Monte Carlo Tree Search, which avoid searching the entire space of solutions by cleverly choosing which solutions to look at. The ultimate expression of heuristic search is metaheuristics —general-purpose search and optimization algorithms that can be applied nearly universally without requiring too much domain knowledge.
Machine learning enables the design of computer programs that improve automatically with experience and is behind some of the most impressive AI applications, such as self-driving cars and generative art. In ML, we often say a program is “trained” instead of explicitly coded to refer to this notion of learning on its own. Ultimately, this involves finding hypotheses that can be efficiently updated with new evidence.
The three major paradigms in this field are supervised, unsupervised, and reinforcement learning. Each case differs in the type of experience and/or feedback the learning algorithm receives. In supervised learning, we use annotated data where the correct output for each input is known. Unsupervised learning, in contrast, doesn’t require a known output; it attempts to extract patterns from the input data solely by looking at its internal structure.
Reinforcement learning involves designing systems that can learn by interaction via trial and error. Instead of a fixed chunk of data, reinforcement learning places a learning system —also called an agent in this paradigm— in an environment, simulated or real. The agent perceives the environment, acts upon it, and receives feedback about its progress in whatever task it is being trained on. When the environment is simulated, we can rely on different paradigms, such as discrete events or Monte Carlo simulations, to build a relatively realistic simulation. Reinforcement learning is crucial in robotics and recent advances in large language models.
All of the above are general approaches that can be applied to various domains, from medical diagnosis to self-driving cars to robots for space exploration. However, two domains of special importance that have seen massive improvements recently are vision and language. The most successful approaches in both fields involve using artificial neural networks, a computational model loosely inspired by the brain that can be trained to perform many perceptual and generative tasks. ANNs draw heavily from algebra, calculus, probability, and statistics, representing some of the most complex computer programs ever created. The field of neural networks is alternatively called deep learning, mostly for branding purposes.
Computer vision deals with endowing computational systems with the ability to process and understand images and videos for tasks like automatic object segmentation and classification. Classic approaches to computer vision rely heavily on signal processing algorithms stemming from algebra and numerical analysis. Modern approaches often leverage neural networks, most commonly convolutional networks, loosely inspired by the visual parts of animal brains.
Natural language processing (NLP) enables computational systems to process, understand, and generate human-like language. It encompasses many problems, from low-level linguistic tasks, such as detecting the part-of-speech of words in a sentence or extracting named entities, to higher-level tasks, such as translation, summarization, or maintaining conversations with a human interlocutor. The most successful approaches in modern NLP leverage transformer networks, a special type of neural network architecture that can represent some forms of contextual information more easily than other architectures. These are the mathematical underpinnings behind technologies like large language models and chatbots.
Computer science is a vast field encompassing both theoretical and practical aspects. It goes from the most abstract computational concepts grounded in logic and mathematics to the most pragmatic applications based on solid engineering principles to frontier research like building intelligent systems. Working in CS also involves navigating complex social interactions between developers, users, and society and requires awareness of the many ethical issues that automation and computation pose.
Mastering computer science is a significant challenge, making it one of the most demanding academic pursuits. However, this discipline offers immense fulfillment since computation lies at the heart of our modern world. Software pervades all sectors and industries. Those proficient in computer science can find work across various disciplines and not only by creating commercial software. Across all scientific disciplines, from understanding the frontiers of space to solving climate change, designing new materials, and extending human life, every challenging problem in every scientific field has computational aspects that require collaboration between computer scientists and experts from that domain.
With this post, I hope to have ignited your curiosity in exploring the beautiful ideas at the core of computer science and its intersection with other fields. If that's the case, then I invite you to stay in touch. I'll be back with more Computer Science topics very soon.