Diary Entry — Aug. 4th, 2024
Today, I finished another part of the CS50x lecture for week 0. Things are getting pretty interesting.
Disclaimer: The layout of this article does not represent the actual order in which any content made by others was presented.
Today, I finished another part of the CS50x lecture for week 0. Things are getting very interesting. It turns out that representing numbers with binary digits is only the most straightforward application of standard representation systems. There are also letters, colors, and even a general discussion of how pictures, audio and video might be represented!
Week 0 Continued: Back to Lecture
ASCII
Professor David J. Malan, the instructor of the course (who appears in the picture above), moves on from numbers to ask the question: how do you represent letters? The question seemed a little difficult to me because letters aren’t numbers, and Professor Malan mentions that there are no more building blocks available; just numbers represented in binary form.
It turns out that these numbers can themselves be assigned to letters, so that in the right context a number is interpreted as a letter. The first standard set up to achieve that was, unsurprisingly, American: the American Standard Code for Information Interchange (ASCII). In this standard, the letters of the English alphabet, along with non-letter characters such as punctuation symbols, decimal digits and some weirdness too (apparently called control codes), are each assigned their own number. However, Professor Malan quickly points out that the eight bits, or one byte, used to store an ASCII character allow for only 256 distinct values (and the standard itself defines just 128 codes, which fit in seven of those bits). That is not nearly enough to cover other writing systems: accented characters in other languages written in the Latin alphabet, like Spanish and Portuguese, the Arabic script used for languages such as Arabic and Persian, and non-alphabetic characters such as those used in the Chinese languages.
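To make the idea concrete, here is a minimal Python sketch of the letter-to-number mapping. It is not code from the lecture, and the function names to_ascii_bits and from_ascii_codes are made up for illustration. It relies on Python's built-in ord and chr, and assumes standard ASCII codes run from 0 to 127.

```python
def to_ascii_bits(text):
    """Return (character, ASCII code, 8-bit binary string) for each character."""
    rows = []
    for ch in text:
        code = ord(ch)  # the number assigned to this character
        if code > 127:
            # Anything above 127 falls outside the ASCII table.
            raise ValueError(f"{ch!r} is not an ASCII character")
        rows.append((ch, code, format(code, "08b")))  # pad to one byte
    return rows


def from_ascii_codes(codes):
    """Interpret a sequence of numbers as ASCII characters."""
    return "".join(chr(code) for code in codes)


for ch, code, bits in to_ascii_bits("HI!"):
    print(ch, code, bits)
# H 72 01001000
# I 73 01001001
# ! 33 00100001

print(from_ascii_codes([72, 73, 33]))  # HI!
print(2 ** 8)  # 256 patterns fit in one byte
```

Calling to_ascii_bits("é") raises a ValueError in this sketch, which is a small illustration of why a one-byte table built around English runs out of room for other languages.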
Unicode
The solution, Professor Malan explains, builds on ASCII and is called Unicode. He says Unicode’s mission was to preserve all human languages past, present and future. To accomplish…