<h1>Separate data schema from data representation</h1>
<p>Yehonathan Sharvit, 2022-06-22</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>With data separated from code and represented with generic and immutable data structures, the next question is how to express the shape of the data. In DOP, the expected shape is expressed as a data schema that is kept separate from the data itself. The main benefit of Principle #4 is that it allows developers to decide which pieces of data should have a schema and which should not.</p>
</div>
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/ajv/6.12.6/ajv.bundle.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.4/lodash.min.js" integrity="sha256-8E6QUcFg1KTnpEU8TFGhpTGHw5fJqB9vCms3OhAYLqw=" crossorigin="anonymous"></script>
<script>
window.ajv = new Ajv();
</script>
<div class="paragraph">
<p>This article is an exploration of the fourth principle of Data-Oriented Programming. The other principles of DOP are explored here:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a>: Separating code (behavior) from data.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>: Representing data with generic data structures.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/immutable-data.html">Principle #3</a>: Treating data as immutable.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/data-validation.html">Principle #4</a>: Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div class="quoteblock">
<blockquote>
<em>Principle #4</em> — Separate data schema from data representation.
</blockquote>
</div>
</div>
</div>
<div class="sect1">
<h2 id="illustration-of-principle-4">Illustration of Principle #4</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Think about handling a request for the addition of an author to the system. To keep things simple, imagine that such a request contains only basic information about the author: their first name and last name and, optionally, the number of books they have written. As seen in Principle #2 (represent data with generic data structures), in DOP, request data is represented as a string map, where the map is expected to have three fields:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>firstName</code> — a string</p>
</li>
<li>
<p><code>lastName</code> — a string</p>
</li>
<li>
<p><code>books</code> — a number (optional)</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>In DOP, the expected shape of data is represented as data that is kept separate from the request data. For instance, JSON schema (<a href="https://json-schema.org/" class="bare">https://json-schema.org/</a>) can represent the data schema of the request with a map. The following listing provides an example.</p>
</div>
<div id="add-author-request-schema-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var addAuthorRequestSchema = {
"type": "object", // <b class="conum">(1)</b>
"required": ["firstName", "lastName"], // <b class="conum">(2)</b>
"properties": {
"firstName": {"type": "string"}, // <b class="conum">(3)</b>
"lastName": {"type": "string"}, // <b class="conum">(4)</b>
"books": {"type": "integer"} // <b class="conum">(5)</b>
}
};</code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>Data is expected to be a map (in JSON, a map is called an object).</p>
</li>
<li>
<p>Only <code>firstName</code> and <code>lastName</code> fields are required.</p>
</li>
<li>
<p><code>firstName</code> must be a string.</p>
</li>
<li>
<p><code>lastName</code> must be a string.</p>
</li>
<li>
<p><code>books</code> must be an integer (when it is provided).</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>A data validation library is used to check whether a piece of data conforms to a data schema. For instance, we could use the Ajv JSON Schema validator (<a href="https://ajv.js.org/" class="bare">https://ajv.js.org/</a>), whose <code>validate</code> function returns <code>true</code> when data is valid and <code>false</code> when data is invalid. The following listing shows this approach.</p>
</div>
<div id="check-data-validity-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var validAuthorData = {
firstName: "Isaac",
lastName: "Asimov",
books: 500
};
ajv.validate(addAuthorRequestSchema,
validAuthorData); // <b class="conum">(1)</b>
// → true
var invalidAuthorData = {
firstName: "Isaac",
lastNam: "Asimov",
books: "five hundred"
};
ajv.validate(addAuthorRequestSchema,
invalidAuthorData); // <b class="conum">(2)</b>
// → false</code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>Data is valid.</p>
</li>
<li>
<p>Data has <code>lastNam</code> instead of <code>lastName</code>, and <code>books</code> is a string instead of a number.</p>
</li>
</ol>
</div>
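<div class="paragraph">
<p>To get a feeling for what such a library does with a schema, here is a minimal sketch of a checker that treats the schema as plain data. It is not Ajv and it covers only a tiny subset of JSON Schema (<code>type</code>, <code>required</code>, and <code>properties</code>). The names <code>typeMatches</code>, <code>conforms</code>, and <code>toySchema</code> are made up for this illustration.</p>
</div>

```javascript
// Toy schema checker: NOT Ajv, and only a small subset of JSON Schema.
function typeMatches(type, value) {
  if (type === "object") {
    // In JSON Schema, "object" means a map, so arrays and null are excluded.
    return typeof value === "object" && value !== null && !Array.isArray(value);
  }
  if (type === "integer") {
    return Number.isInteger(value);
  }
  return typeof value === type; // covers "string", "number", and "boolean"
}

function conforms(schema, data) {
  if (schema.type !== undefined && !typeMatches(schema.type, data)) {
    return false;
  }
  if (typeof data !== "object" || data === null) {
    return true; // "required" and "properties" only constrain maps
  }
  var required = schema.required || [];
  var properties = schema.properties || {};
  var hasRequired = required.every(function (key) {
    return Object.prototype.hasOwnProperty.call(data, key);
  });
  var fieldsConform = Object.keys(properties).every(function (key) {
    // An absent field is not checked: fields are optional by default.
    return !Object.prototype.hasOwnProperty.call(data, key) ||
      conforms(properties[key], data[key]);
  });
  return hasRequired && fieldsConform;
}

var toySchema = {
  "type": "object",
  "required": ["firstName", "lastName"],
  "properties": {
    "firstName": {"type": "string"},
    "lastName": {"type": "string"},
    "books": {"type": "integer"}
  }
};

var validResult = conforms(toySchema,
  {firstName: "Isaac", lastName: "Asimov", books: 500});
console.log(validResult);
// → true
var invalidResult = conforms(toySchema,
  {firstName: "Isaac", lastNam: "Asimov", books: "five hundred"});
console.log(invalidResult);
// → false
```

<div class="paragraph">
<p>The point of the sketch is that the schema is an ordinary map interpreted by a generic function. A real validator supports many more keywords and reports why the data is invalid.</p>
</div>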
<div class="paragraph">
<p>When data is invalid, the details about data validation failures are available in a human-readable format. The next listing shows this approach.</p>
</div>
<div id="data-validation-errors-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var invalidAuthorData = {
firstName: "Isaac",
lastNam: "Asimov",
books: "five hundred"
};
var ajv = new Ajv({allErrors: true}); // <b class="conum">(1)</b>
ajv.validate(addAuthorRequestSchema, invalidAuthorData);
ajv.errorsText(ajv.errors); // <b class="conum">(2)</b>
// → "data should have required property 'lastName',
// → data.books should be integer"</code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>By default, Ajv stores only the first data validation error. Set <code>allErrors: true</code> to store all errors.</p>
</li>
<li>
<p>Data validation errors are stored internally as an array. In order to get a human readable string, use the <code>errorsText</code> function.</p>
</li>
</ol>
</div>
</div>
</div>
<div class="sect1">
<h2 id="benefits-of-principle-4">Benefits of Principle #4</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Separation of data schema from data representation provides numerous benefits. The following sections describe these benefits in detail:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Freedom to choose what data should be validated</p>
</li>
<li>
<p>Optional fields</p>
</li>
<li>
<p>Advanced data validation conditions</p>
</li>
<li>
<p>Automatic generation of data model visualization</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="benefit-1-freedom-to-choose-what-data-should-be-validated">Benefit #1: Freedom to choose what data should be validated</h3>
<div class="paragraph">
<p>When data schema is separated from data representation, we can instantiate data without specifying its expected shape. Such freedom is useful in various situations. For example,</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Rapid prototyping or experimentation</p>
</li>
<li>
<p>Code refactoring and data validation</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Consider rapid prototyping. In classic OOP, we need to instantiate every piece of data through a class. During the exploration phase of coding, when the final shape of our data is not yet known, being forced to update the class definition each time the data model changes slows us down. DOP enables a faster pace during the exploration phase by delaying the data schema definition to a later phase.</p>
</div>
<div class="paragraph">
<p>One common refactoring pattern is split phase refactoring (<a href="https://refactoring.com/catalog/splitPhase.html" class="bare">https://refactoring.com/catalog/splitPhase.html</a>), where a single large function is split into multiple smaller functions with private scope. These functions are called with data that has already been validated by the larger function. In DOP, it is not necessary to specify the shape of the arguments of the inner functions; they can rely on the data validation that has already occurred.</p>
</div>
<div class="paragraph">
<p>Consider how to display some information about an author, such as their full name and whether they are considered prolific. Using the code shown earlier to illustrate Principle #2 to calculate the full name and the prolificity level of the author, one might come up with a <code>displayAuthorInfo</code> function as the following listing shows.</p>
</div>
<div id="author-info-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class NameCalculation {
static fullName(data) {
return data.firstName + " " + data.lastName;
}
}
class AuthorRating {
static isProlific (data) {
return data.books > 100;
}
}
var authorSchema = {
"type": "object",
"required": ["firstName", "lastName"],
"properties": {
"firstName": {"type": "string"},
"lastName": {"type": "string"},
"books": {"type": "integer"}
}
};
function displayAuthorInfo(authorData) {
if(!ajv.validate(authorSchema, authorData)) {
throw new Error("displayAuthorInfo called with invalid data");
}
console.log("Author full name is: ",
NameCalculation.fullName(authorData));
if(authorData.books == null) {
console.log("Author has not written any book");
} else {
if (AuthorRating.isProlific(authorData)) {
console.log("Author is prolific");
} else {
console.log("Author is not prolific");
}
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Notice that the first thing done inside the body of <code>displayAuthorInfo</code> is to validate the argument passed to the function. Now, apply the split phase refactoring pattern to this simple example and split the body of <code>displayAuthorInfo</code> into two inner functions:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>displayFullName</code> displays the author’s full name.</p>
</li>
<li>
<p><code>displayProlificity</code> displays whether the author is prolific or not.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The next listing shows the resulting code.</p>
</div>
<div id="author-info-refactored-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function displayFullName(authorData) {
console.log("Author full name is: ",
NameCalculation.fullName(authorData));
}
function displayProlificity(authorData) {
if(authorData.books == null) {
console.log("Author has not written any book");
} else {
if (AuthorRating.isProlific(authorData)) {
console.log("Author is prolific");
} else {
console.log("Author is not prolific");
}
}
}
function displayAuthorInfo(authorData) {
if(!ajv.validate(authorSchema, authorData)) {
throw new Error("displayAuthorInfo called with invalid data");
}
displayFullName(authorData);
displayProlificity(authorData);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Having the data schema separated from data representation eliminates the need to specify a data schema for the arguments of the inner functions <code>displayFullName</code> and <code>displayProlificity</code>. It makes the refactoring process a bit smoother. In some cases, the inner functions are more complicated, and it makes sense to specify a data schema for their arguments. DOP gives us the freedom to choose!</p>
</div>
</div>
<div class="sect2">
<h3 id="benefit-2-optional-fields">Benefit #2: Optional fields</h3>
<div class="paragraph">
<p>In OOP, allowing a class member to be optional is not easy. For instance, in Java one needs a special construct like the <code>Optional</code> class introduced in Java 8 (<a href="http://mng.bz/4jWa" class="bare">http://mng.bz/4jWa</a>). In DOP, it is natural to declare a field as optional in a map. In fact, in JSON Schema, by default, every field is optional.</p>
</div>
<div class="paragraph">
<p>In order to make a field not optional, its name must be included in the <code>required</code> array as, for instance, in the author schema in the following listing, where only <code>firstName</code> and <code>lastName</code> are required, and <code>books</code> is optional. Notice that when an optional field is defined in a map, its value is validated against the schema.</p>
</div>
<div id="author-schema-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var authorSchema = {
"type": "object",
"required": ["firstName", "lastName"], // <b class="conum">(1)</b>
"properties": {
"firstName": {"type": "string"},
"lastName": {"type": "string"},
"books": {"type": "number"} // <b class="conum">(2)</b>
}
};</code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p><code>books</code> is not included in <code>required</code> as it is an optional field.</p>
</li>
<li>
<p>When present, <code>books</code> must be a number.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Let’s illustrate how the validation function deals with optional fields. A map without a <code>books</code> field is considered to be valid:</p>
</div>
<div id="author-no-books-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var authorDataNoBooks = {
"firstName": "Yehonathan",
"lastName": "Sharvit"
};
ajv.validate(authorSchema, authorDataNoBooks); // <b class="conum">(1)</b>
// → true </code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>The validation passes as <code>books</code> is an optional field.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Alternatively, a map with a <code>books</code> field, where the value is not a number, is considered to be invalid:</p>
</div>
<div id="author-invalid-books-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var authorDataInvalidBooks = {
"firstName": "Albert",
"lastName": "Einstein",
"books": "Five"
};
ajv.validate(authorSchema, authorDataInvalidBooks); // <b class="conum">(1)</b>
// → false </code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>The validation fails as <code>books</code> is not a number.</p>
</li>
</ol>
</div>
</div>
<div class="sect2">
<h3 id="benefit-3-advanced-data-validation-conditions">Benefit #3: Advanced data validation conditions</h3>
<div class="paragraph">
<p>In DOP, data validation occurs at run time. This allows the definition of data validation conditions that go beyond the type of a field. For example, we can validate that a field is not only a string, but a string with a maximal number of characters, or that a number falls within a given range.</p>
</div>
<div class="paragraph">
<p>JSON Schema supports many other advanced data validation conditions, such as regular expression validation for string fields or number fields that should be a multiple of a given number. The author schema in the following listing expects <code>firstName</code> and <code>lastName</code> to be strings of at most 100 characters, and <code>books</code> to be an integer between 0 and 10,000.</p>
</div>
<div id="author-schema-complex-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var authorComplexSchema = {
"type": "object",
"required": ["firstName", "lastName"],
"properties": {
"firstName": {
"type": "string",
"maxLength": 100
},
"lastName": {
"type": "string",
"maxLength": 100
},
"books": {
"type": "integer",
"minimum": 0,
"maximum": 10000
}
}
};</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="benefit-4-automatic-generation-of-data-model-visualization">Benefit #4: Automatic generation of data model visualization</h3>
<div class="paragraph">
<p>With the data schema defined as data, we can use several tools to generate data model visualizations. With tools like JSON Schema Viewer (<a href="https://navneethg.github.io/jsonschemaviewer/" class="bare">https://navneethg.github.io/jsonschemaviewer/</a>) and Malli (<a href="https://github.com/metosin/malli" class="bare">https://github.com/metosin/malli</a>), a UML diagram can be generated from a JSON schema.</p>
</div>
<div class="paragraph">
<p>For instance, the JSON schema in the following listing defines the shape of a <code>bookList</code> field, which is an array of books where each book is a map. The figure that follows the listing shows the UML diagram generated from this schema.</p>
</div>
<div id="author-schema-visualize" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="json"><span></span><span class="tok-p">{</span>
<span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"object"</span><span class="tok-p">,</span>
<span class="tok-nt">"required"</span><span class="tok-p">:</span> <span class="tok-p">[</span><span class="tok-s2">"firstName"</span><span class="tok-p">,</span> <span class="tok-s2">"lastName"</span><span class="tok-p">],</span>
<span class="tok-nt">"properties"</span><span class="tok-p">:</span> <span class="tok-p">{</span>
<span class="tok-nt">"firstName"</span><span class="tok-p">:</span> <span class="tok-p">{</span><span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"string"</span><span class="tok-p">},</span>
<span class="tok-nt">"lastName"</span><span class="tok-p">:</span> <span class="tok-p">{</span><span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"string"</span><span class="tok-p">},</span>
<span class="tok-nt">"bookList"</span><span class="tok-p">:</span> <span class="tok-p">{</span>
<span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"array"</span><span class="tok-p">,</span>
<span class="tok-nt">"items"</span><span class="tok-p">:</span> <span class="tok-p">{</span>
<span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"object"</span><span class="tok-p">,</span>
<span class="tok-nt">"properties"</span><span class="tok-p">:</span> <span class="tok-p">{</span>
<span class="tok-nt">"title"</span><span class="tok-p">:</span> <span class="tok-p">{</span><span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"string"</span><span class="tok-p">},</span>
<span class="tok-nt">"publicationYear"</span><span class="tok-p">:</span> <span class="tok-p">{</span><span class="tok-nt">"type"</span><span class="tok-p">:</span> <span class="tok-s2">"integer"</span><span class="tok-p">}</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span></code></pre>
</div>
</div>
<div id="author-schema-uml" class="imageblock">
<div class="content">
<img src="/uml/chapter00/author-schema.png" alt="author schema">
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="cost-for-principle-4">Cost for Principle #4</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Applying Principle #4 comes with a price. The following sections look at these costs:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Weak connection between data and its schema</p>
</li>
<li>
<p>Small performance hit</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="cost-1-weak-connection-between-data-and-its-schema">Cost #1: Weak connection between data and its schema</h3>
<div class="paragraph">
<p>By definition, when data schema and data representation are separated, the connection between data and its schema is weaker than when data is represented with classes. Moreover, the schema definition language (e.g., JSON Schema) is not part of the programming language. It is up to the developer to decide where data validation is necessary and where it is superfluous. As the idiom says, with great power comes great responsibility.</p>
</div>
</div>
<div class="sect2">
<h3 id="cost-2-light-performance-hit">Cost #2: Small performance hit</h3>
<div class="paragraph">
<p>As mentioned earlier, there exist implementations of JSON Schema validation in most programming languages. In DOP, data validation occurs at run time, and it takes some time to run the data validation. In OOP, data validation usually occurs at compile time.</p>
</div>
<div class="paragraph">
<p>This drawback is mitigated by the fact that, even in OOP, some parts of data validation occur at run time. For instance, the conversion of a request JSON payload into an object occurs at run time. Moreover, in DOP, it is quite common to have some data validation parts enabled only during development and to disable them when the system runs in production. As a consequence, this performance hit is not significant.</p>
</div>
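<div class="paragraph">
<p>One possible way to enable validation only during development is to guard it with a flag, as in the sketch below. Everything here is an assumption made for illustration: the <code>validationEnabled</code> flag, the <code>checkInDev</code> helper, and the <code>looksLikeAuthor</code> function, which stands in for a schema-based check such as <code>ajv.validate(authorSchema, data)</code>.</p>
</div>

```javascript
// Hypothetical flag; in a real system it might come from configuration.
var validationEnabled = true;

// Hypothetical helper: runs the check only when validation is enabled.
function checkInDev(isValid, data, label) {
  if (validationEnabled && !isValid(data)) {
    throw new Error(label + " called with invalid data");
  }
}

// Stand-in for a schema-based check such as
// ajv.validate(authorSchema, data).
function looksLikeAuthor(data) {
  return typeof data.firstName === "string" &&
    typeof data.lastName === "string";
}

function greetAuthor(authorData) {
  checkInDev(looksLikeAuthor, authorData, "greetAuthor");
  return "Hello, " + authorData.firstName + " " + authorData.lastName;
}

var greeting = greetAuthor({firstName: "Isaac", lastName: "Asimov"});
console.log(greeting);
// → "Hello, Isaac Asimov"

validationEnabled = false; // simulate a production setting
var unchecked = greetAuthor({firstName: "Isaac"});
console.log(unchecked);
// → "Hello, Isaac undefined"
```

<div class="paragraph">
<p>The last call shows the trade-off: with validation disabled, no time is spent on the check, but invalid data slips through unnoticed.</p>
</div>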
</div>
</div>
</div>
<div class="sect1">
<h2 id="summary-of-principle-4">Summary of Principle #4</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In DOP, data is represented with immutable generic data structures. When additional information about the shape of the data is required, a data schema can be defined (e.g., using JSON Schema). Keeping the data schema separate from the data representation gives us the freedom to decide where data should be validated.</p>
</div>
<div class="paragraph">
<p>Moreover, data validation occurs at run time. As a consequence, data validation conditions that go beyond the static data types (e.g., the string length) can be expressed. However, with great power comes great responsibility, and it is up to the developer to remember to validate data.</p>
</div>
<div class="paragraph">
<p><strong>DOP Principle #4: Separate data schema from data representation</strong></p>
</div>
<div class="paragraph">
<p>To adhere to this principle, keep the data schema separate from the data representation.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principle-4-schema.png" alt="do principle 4 schema">
</div>
</div>
<div class="paragraph">
<p>Benefits include</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Freedom to choose what data should be validated</p>
</li>
<li>
<p>Optional fields</p>
</li>
<li>
<p>Advanced data validation conditions</p>
</li>
<li>
<p>Automatic generation of data model visualization</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The cost for implementing Principle #4 includes</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Weak connection between data and its schema</p>
</li>
<li>
<p>A small performance hit</p>
</li>
</ul>
</div>
</div>
</div>
<h1>Data is immutable</h1>
<p>Yehonathan Sharvit, 2022-06-22</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>With data separated from code and represented with generic data structures, how are changes to the data managed? DOP is very strict on this question. Mutation of data is not allowed! In DOP, changes to data are accomplished by creating new versions of the data. The <em>reference</em> to a variable may be changed so that it refers to a new version of the data, but the <em>value</em> of the data itself must never change.</p>
</div>
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/immutable/4.0.0/immutable.min.js"></script>
<div class="paragraph">
<p>This article is an exploration of the third principle of Data-Oriented Programming. The other principles of DOP are explored here:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a>: Separating code (behavior) from data.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>: Representing data with generic data structures.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/immutable-data.html">Principle #3</a>: Treating data as immutable.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/data-validation.html">Principle #4</a>: Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div class="quoteblock">
<blockquote>
<em>Principle #3</em> — Data is immutable.
</blockquote>
</div>
</div>
</div>
<div class="sect1">
<h2 id="illustration-of-principle-3">Illustration of Principle #3</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Think about the number 42. What happens to 42 when you add 1 to it? Does it become 43? No, 42 stays 42 forever! Now, put 42 inside an object: <code>{num: 42}</code>. What happens to the object when you add 1 to 42? Does it become 43? It depends on the programming language.</p>
</div>
<div class="ulist">
<ul>
<li>
<p>In Clojure, a programming language that embraces data immutability, the value of the <code>num</code> field stays <code>42</code> forever, no matter what.</p>
</li>
<li>
<p>In many programming languages, the value of the <code>num</code> field becomes <code>43</code>.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>For instance, in JavaScript, mutating the field of a map referred to by two variables has an impact on both variables. The following listing demonstrates this.</p>
</div>
<div id="mutating-data-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var myData = {num: 42};
var yourData = myData;
yourData.num = yourData.num + 1;
console.log(myData.num);
// → 43</code></pre>
</div>
</div>
<div class="paragraph">
<p>Now, <code>myData.num</code> equals <code>43</code>. According to DOP, however, data should never change! Instead of mutating data, a new version of it is created. A naive (and inefficient) way to create a new version of the data is to clone it before modifying it. For instance, in the following listing, there is a function that changes the value of a field inside an object by cloning the object via <code>Object.assign</code>, provided natively by JavaScript. When <code>changeValue</code> is called on <code>myData</code>, <code>myData</code> is not affected; <code>myData.num</code> remains <code>42</code>. This is the essence of data immutability!</p>
</div>
<div id="cloning-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function changeValue(obj, k, v) {
var res = Object.assign({}, obj);
res[k] = v;
return res;
}
var myData = {num: 42};
var yourData = changeValue(myData, "num", myData.num + 1);
console.log(myData.num);
// → 42</code></pre>
</div>
</div>
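<div class="paragraph">
<p>One caveat with the cloning approach: <code>Object.assign</code> copies only the top level of an object, so nested maps are still shared between the original and the clone. A possible way to update a nested field without touching the original is to copy each map along the path to that field, as in the sketch below. The <code>setIn</code> helper and the <code>library</code> data are hypothetical names used for illustration, and the helper handles plain maps only, not arrays.</p>
</div>

```javascript
// Hypothetical helper: returns a new version of obj in which the value
// at path is replaced. Only the maps along the path are copied.
// Assumes every intermediate key already holds a plain map.
function setIn(obj, path, value) {
  if (path.length === 0) {
    return value;
  }
  var key = path[0];
  var copy = Object.assign({}, obj);
  copy[key] = setIn(obj[key], path.slice(1), value);
  return copy;
}

var library = {
  catalog: {name: "Main"},
  author: {name: "Isaac Asimov", stats: {books: 500}}
};
var updated = setIn(library, ["author", "stats", "books"], 501);

console.log(library.author.stats.books);
// → 500
console.log(updated.author.stats.books);
// → 501
console.log(updated.catalog === library.catalog);
// → true
```

<div class="paragraph">
<p>The original <code>library</code> is left as it was, and the untouched <code>catalog</code> map is reused by both versions instead of being copied.</p>
</div>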
<div class="paragraph">
<p>Embracing immutability in an efficient way, both in terms of computation and memory, requires a third-party library like Immutable.js (<a href="https://immutable-js.com/" class="bare">https://immutable-js.com/</a>), which provides an efficient implementation of persistent data structures (a.k.a. immutable data structures). In most programming languages, there exist libraries that provide an efficient implementation of persistent data structures.</p>
</div>
<div class="paragraph">
<p>With <code>Immutable.js</code>, JavaScript native maps and arrays are not used, but rather, immutable maps and immutable lists instantiated via <code>Immutable.Map</code> and <code>Immutable.List</code>. An element of a map is accessed using the <code>get</code> method. A new version of the map is created when a field is modified with the <code>set</code> method.</p>
</div>
<div class="paragraph">
<p>Here is how to create and manipulate immutable data efficiently with a third-party library. In the output, <code>yourData.get("num")</code> is <code>43</code>, but <code>myData.get("num")</code> remains <code>42</code>.</p>
</div>
<div id="immutable-js-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var myData = Immutable.Map({num: 42});
var yourData = myData.set("num", 43);
console.log(yourData.get("num"));
// → 43
console.log(myData.get("num"));
// → 42</code></pre>
</div>
</div>
<div class="quoteblock">
<blockquote>
When data is immutable, instead of mutating data, a new version of it is created.
</blockquote>
</div>
</div>
</div>
<div class="sect1">
<h2 id="benefits-of-principle-3">Benefits of Principle #3</h2>
<div class="sectionbody">
<div class="paragraph">
<p>When programs are constrained from mutating data, we derive benefit in numerous ways. The following sections detail these benefits:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Data access to all with confidence</p>
</li>
<li>
<p>Predictable code behavior</p>
</li>
<li>
<p>Fast equality checks</p>
</li>
<li>
<p>Concurrency safety for free</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="benefit-1-data-access-to-all-with-confidence">Benefit #1: Data access to all with confidence</h3>
<div class="paragraph">
<p>According to Principle #1 (separate code from data), data access is transparent. Any function is allowed to access any piece of data. Without data immutability, we must be careful when passing data as an argument to a function. We can either make sure the function does not mutate the data or clone the data before it is passed to the function. When adhering to data immutability, none of this is required.</p>
</div>
<div class="quoteblock">
<blockquote>
When data is immutable, it can be passed to any function with confidence because data never changes.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="benefit-2-predictable-code-behavior">Benefit #2: Predictable code behavior</h3>
<div class="paragraph">
<p>As an illustration of what is meant by <em>predictable</em>, here is an example of an <em>unpredictable</em> piece of code that does not adhere to data immutability. Take a look at the piece of asynchronous JavaScript code in the following listing. When data is mutable, the behavior of asynchronous code is not predictable.</p>
</div>
<div id="async-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var myData = {num: 42};
setTimeout(function (data){
console.log(data.num);
}, 1000, myData);
myData.num = 0;</code></pre>
</div>
</div>
<div class="paragraph">
<p>The value of <code>data.num</code> inside the timeout callback is not predictable. It depends on whether the data is modified by another piece of code during the 1,000 ms of the timeout. However, with immutable data, it is guaranteed that data never changes and that <code>data.num</code> is always <code>42</code> inside the callback.</p>
</div>
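<div class="paragraph">
<p>For comparison, here is a sketch of the same scenario with the data frozen via <code>Object.freeze()</code>. This is an assumption made for illustration: a persistent data structure library would play the same role. The variable name <code>newData</code> is hypothetical.</p>
</div>

```javascript
var myData = Object.freeze({num: 42});
setTimeout(function (data) {
  console.log(data.num); // prints 42
}, 1000, myData);

// Instead of mutating myData, create a new version of it.
var newData = Object.freeze(Object.assign({}, myData, {num: 0}));
```

<div class="paragraph">
<p>The callback keeps seeing the version of the data it was given, while the rest of the program works with <code>newData</code>.</p>
</div>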
<div class="quoteblock">
<blockquote>
When data is immutable, the behavior of code that manipulates data is predictable.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="benefit-3-fast-equality-checks">Benefit #3: Fast equality checks</h3>
<div class="paragraph">
<p>With UI frameworks like React.js, there are frequent checks to see what portion of the UI data has been modified since the previous rendering cycle. Portions that did not change are not rendered again. In fact, in a typical frontend application, most of the UI data is left unchanged between subsequent rendering cycles.</p>
</div>
<div class="paragraph">
<p>In a React application that does not adhere to data immutability, it is necessary to check every (nested) part of the UI data. However, in a React application that follows data immutability, it is possible to optimize the comparison of the data for the case where data is not modified. Indeed, when the object address is the same, then it is certain that the data did not change.</p>
</div>
<div class="paragraph">
<p>Comparing object addresses is much faster than comparing all the fields. In Part 1 of my book, fast equality checks are used to reconcile between concurrent mutations in a highly scalable production system.</p>
</div>
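<div class="paragraph">
<p>The idea can be sketched in plain JavaScript as follows. This is not how React is implemented. It only illustrates comparison by reference under the assumption that every change produces a new object. The names <code>prevState</code>, <code>nextState</code>, and <code>changedKeys</code> are hypothetical.</p>
</div>

```javascript
var prevState = Object.freeze({
  user: Object.freeze({name: "Isaac"}),
  settings: Object.freeze({theme: "dark"})
});

// The new version shares the unchanged `user` branch with the previous one.
var nextState = Object.freeze(Object.assign({}, prevState, {
  settings: Object.freeze({theme: "light"})
}));

// Hypothetical helper: one reference comparison per top-level key.
function changedKeys(prev, next) {
  return Object.keys(next).filter(function (key) {
    return prev[key] !== next[key];
  });
}

prevState.user === nextState.user;
// → true
changedKeys(prevState, nextState);
// → ["settings"]
```

<div class="paragraph">
<p>Because <code>user</code> is the same object in both versions, there is no need to inspect its fields to know that it did not change.</p>
</div>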
<div class="quoteblock">
<blockquote>
Immutable data enables fast equality checks by comparing data by reference.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="benefit-4-free-concurrency-safety">Benefit #4: Free concurrency safety</h3>
<div class="paragraph">
<p>In a multi-threaded environment, concurrency safety mechanisms (e.g., mutexes) are often used to prevent the data in thread <code>A</code> from being modified while it is accessed in thread <code>B</code>. In addition to the slight performance hit they cause, concurrency safety mechanisms impose a mental burden that makes code writing and reading much more difficult.</p>
</div>
<div class="quoteblock">
<blockquote>
Adherence to data immutability eliminates the need for a concurrency mechanism. The data you have in hand never changes!
</blockquote>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="cost-for-principle-3">Cost for Principle #3</h2>
<div class="sectionbody">
<div class="paragraph">
<p>As with the previous principles, applying Principle #3 comes at a price. The following sections look at these costs:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Performance hit</p>
</li>
<li>
<p>Required library for persistent data structures</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="cost-1-performance-hit">Cost #1: Performance hit</h3>
<div class="paragraph">
<p>As mentioned earlier, there exist implementations of persistent data structures in most programming languages. But even the most efficient implementation is a bit slower than the in-place mutation of the data. In most applications, the performance hit and the additional memory consumption involved in using immutable data structures are not significant. But this is something to keep in mind.</p>
</div>
</div>
<div class="sect2">
<h3 id="cost-2-required-library-for-persistent-data-structures">Cost #2: Required library for persistent data structures</h3>
<div class="paragraph">
<p>In a language like Clojure, the native data structures of the language are immutable. However, in most programming languages, adhering to data immutability requires the inclusion of a third-party library that provides an implementation of persistent data structures.</p>
</div>
<div class="paragraph">
<p>The fact that the data structures are not native to the language means that it is difficult (if not impossible) to enforce the usage of immutable data across the board. Also, when integrating with third-party libraries (e.g., a chart library), persistent data structures must be converted into equivalent native data structures.</p>
</div>
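<div class="paragraph">
<p>To give a feel for this conversion step without committing to a particular library, the sketch below uses natively frozen data instead of persistent data structures. Both <code>deepFreeze</code> and <code>sortInPlace</code> are hypothetical names, and <code>sortInPlace</code> merely stands in for a third-party function that expects mutable native data.</p>
</div>

```javascript
// Hypothetical helper: recursively freezes plain objects and arrays.
// Assumes the data is acyclic (for example, data parsed from JSON).
function deepFreeze(value) {
  if (value !== null && typeof value === "object") {
    Object.values(value).forEach(deepFreeze);
    Object.freeze(value);
  }
  return value;
}

var sales = deepFreeze({year: 2022, points: [3, 1, 2]});

// Stand-in for a third-party function that sorts its input in place.
function sortInPlace(points) {
  return points.sort();
}

// Hand the library a mutable native copy rather than the immutable data.
sortInPlace(sales.points.slice());
// → [1, 2, 3]
sales.points;
// → [3, 1, 2]
```

<div class="paragraph">
<p>Nothing in the language forces every piece of data to go through <code>deepFreeze</code>, which is why enforcing immutability across the board relies on discipline or tooling.</p>
</div>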
</div>
</div>
</div>
<div class="sect1">
<h2 id="summary-of-principle-3">Summary of Principle #3</h2>
<div class="sectionbody">
<div class="paragraph">
<p>DOP considers data as a value that never changes. Adherence to this principle results in code that is predictable even in a multi-threaded environment, and equality checks are fast. However, a non-negligible mind shift is required, and in most programming languages, a third-party library is needed to provide an efficient implementation of persistent data structures.</p>
</div>
<div class="paragraph">
<p><strong>DOP Principle #3: Data is immutable</strong></p>
</div>
<div class="paragraph">
<p>To adhere to this principle, data is represented with immutable structures.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principle-3-immutable-data.png" alt="do principle 3 immutable data">
</div>
</div>
<div class="paragraph">
<p>Benefits include</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Data access to all with confidence</p>
</li>
<li>
<p>Predictable code behavior</p>
</li>
<li>
<p>Fast equality checks</p>
</li>
<li>
<p>Concurrency safety for free</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The cost for implementing Principle #3 includes</p>
</div>
<div class="ulist">
<ul>
<li>
<p>A performance hit</p>
</li>
<li>
<p>Required library for persistent data structures</p>
</li>
</ul>
</div>
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
</div>
</div>Yehonathan SharvitWith data separated from code and represented with generic data structures, how are changes to the data managed? DOP is very strict on this question. Mutation of data is not allowed! In DOP, changes to data are accomplished by creating new versions of the data. The reference to a variable may be changed so that it refers to a new version of the data, but the value of the data itself must never change.Represent data with generic data structures2022-06-22T04:33:24+02:002022-06-22T04:33:24+02:00/databook/2022/06/22/generic-data-structures<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>When adhering to <a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1 of DOP</a>, code is separated from data. DOP is not opinionated about the programming constructs to use for organizing the code, but it has a lot to say about how the data should be represented. This is the theme of Principle #2.</p>
</div>
<div class="paragraph">
<p>The most common generic data structures are maps (a.k.a. dictionaries) and arrays (or lists). But other generic data structures (e.g., sets, trees, and queues) can be used as well. Principle #2 does not deal with the mutability or the immutability of the data. That is the theme of Principle #3.</p>
</div>
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.4/lodash.min.js" integrity="sha256-8E6QUcFg1KTnpEU8TFGhpTGHw5fJqB9vCms3OhAYLqw=" crossorigin="anonymous"></script>
<div class="paragraph">
<p>This article is an exploration of the second principle of Data-Oriented Programming. The other principles of DOP are explored here:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a>: Separating code (behavior) from data.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>: Representing data with generic data structures.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/immutable-data.html">Principle #3</a>: Treating data as immutable.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/data-validation.html">Principle #4</a>: Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div class="quoteblock">
<blockquote>
<em>Principle #2</em> — Represent application data with generic data structures.
</blockquote>
</div>
</div>
</div>
<div class="sect1">
<h2 id="illustration-of-principle-2">Illustration of Principle #2</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In DOP, data is represented with generic data structures (like maps and arrays) instead of instantiating data via specific classes. In fact, most of the data entities that appear in a typical application can be represented with maps and arrays (or lists). But there exist other generic data structures (e.g., sets, trees, and queues) that might be required in some use cases. Let’s look at the same simple example we used to illustrate <a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a> (data that represents an author).</p>
</div>
<div class="paragraph">
<p>An author is a data entity with a <code>firstName</code>, a <code>lastName</code>, and the number of <code>books</code> they have written. Principle #2 is broken when we use a specific class to represent an author as this listing reveals.</p>
</div>
<div id="break-2-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class AuthorData {
constructor(firstName, lastName, books) {
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>Principle #2 is followed when using a map (a dictionary or an associative array) as a generic data structure that represents an author. The following listing illustrates how we can follow this principle in OOP.</p>
</div>
<div id="follow-2-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorData(firstName, lastName, books) {
var data = new Map();
// Store the fields as map entries through the Map API.
data.set("firstName", firstName);
data.set("lastName", lastName);
data.set("books", books);
return data;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In a language like JavaScript, we can also instantiate a map via a data literal, which is a bit more convenient. The following listing shows an example.</p>
</div>
<div id="follow-2-literal-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorData(firstName, lastName, books) {
return {
firstName: firstName,
lastName: lastName,
books: books
};
}</code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="benefits-of-principle-2">Benefits of Principle #2</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Using generic data structures to represent data has multiple benefits. We cover these benefits in greater detail in the following sections:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The ability to use generic functions that are not limited to our specific use case</p>
</li>
<li>
<p>A flexible data model</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="using-functions-that-are-not-limited-to-a-specific-use-case">Using functions that are not limited to a specific use case</h3>
<div class="paragraph">
<p>Using generic data structures to represent data makes it possible to manipulate data with the rich set of functions that our programming language natively provides for those data structures. Third-party libraries provide even more of these functions. For instance, JavaScript natively provides some basic functions on maps and arrays, and third-party libraries like Lodash (<a href="https://lodash.com/" class="bare">https://lodash.com/</a>) extend the functionality with even more functions. There is a famous quote by Alan Perlis that summarizes this benefit:</p>
</div>
<div class="quoteblock">
<blockquote>
<div class="paragraph">
<p>It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.</p>
</div>
</blockquote>
</div>
<div class="paragraph">
<p>When an author is represented as a map, the author can be serialized into JSON using <code>JSON.stringify()</code>, which is part of JavaScript. The following listing provides an example.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var data = createAuthorData("Isaac", "Asimov", 500);
JSON.stringify(data);
// → "{\"firstName\":\"Isaac\",\"lastName\":\"Asimov\",\"books\":500}"</code></pre>
</div>
</div>
<div class="paragraph">
<p>Serializing author data without the number of books can be accomplished via Lodash’s <code>_.pick()</code> function. The following listing uses <code>_.pick()</code> to create an object with a subset of keys.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var data = createAuthorData("Isaac", "Asimov", 500);
var dataWithoutBooks = _.pick(data, ["firstName", "lastName"]);
JSON.stringify(dataWithoutBooks);
// → "{\"firstName\":\"Isaac\",\"lastName\":\"Asimov\"}"</code></pre>
</div>
</div>
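<div class="paragraph">
<p>The same point can be made with native functions only. The sketch below defines a hypothetical <code>pick</code> helper (a simplified stand-in for Lodash’s <code>_.pick()</code>, not its actual implementation) and applies it to two unrelated maps. The <code>book</code> map is made up for the sake of the example.</p>
</div>

```javascript
// Hypothetical stand-in for Lodash's _.pick(), built on native functions.
function pick(map, keys) {
  return Object.fromEntries(
    keys
      .filter(function (key) { return key in map; })
      .map(function (key) { return [key, map[key]]; })
  );
}

var author = {firstName: "Isaac", lastName: "Asimov", books: 500};
var book = {title: "Foundation", publicationYear: 1951};

JSON.stringify(pick(author, ["firstName", "lastName"]));
// → "{\"firstName\":\"Isaac\",\"lastName\":\"Asimov\"}"
JSON.stringify(pick(book, ["title"]));
// → "{\"title\":\"Foundation\"}"
```

<div class="paragraph">
<p>Because both entities are plain maps, one generic function serves both. With specific classes, each class would need its own way of selecting fields.</p>
</div>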
<div class="quoteblock">
<blockquote>
When adhering to Principle #2, a rich set of functionality is available for data manipulation.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="flexible-data-model">Flexible data model</h3>
<div class="paragraph">
<p>When using generic data structures, the data model is flexible, and data is not forced into a specific shape. Data can be created with no predefined shape, and its shape can be modified at will.</p>
</div>
<div class="paragraph">
<p>In classic OOP, when <em>not</em> adhering to Principle #2, each piece of data is instantiated via a class and must follow a rigid shape. When a slightly different data shape is needed, a new class must be defined. Take, for example, <code>AuthorData</code>, a class that represents an author entity made of three fields: <code>firstName</code>, <code>lastName</code>, and <code>books</code>. Suppose that you want to add a field called <code>fullName</code> with the full name of the author. If we fail to adhere to Principle #2, a new class <code>AuthorDataWithFullName</code> must be defined. However, when using generic data structures, fields can be added to (or removed from) a map <em>on the fly</em> as the following listing shows.</p>
</div>
<div id="fly-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var data = createAuthorData("Isaac", "Asimov", 500);
data.fullName = "Isaac Asimov";</code></pre>
</div>
</div>
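<div class="paragraph">
<p>Removing a field works the same way. The following sketch builds the author map with a data literal, adds <code>fullName</code>, and removes <code>books</code>. It mutates the map in place, which is acceptable at this stage because Principle #2 does not deal with immutability.</p>
</div>

```javascript
var data = {firstName: "Isaac", lastName: "Asimov", books: 500};

// Add a field on the fly.
data.fullName = data.firstName + " " + data.lastName;
// Remove a field on the fly.
delete data.books;

Object.keys(data);
// → ["firstName", "lastName", "fullName"]
```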
<div class="quoteblock">
<blockquote>
Working with a flexible data model is particularly useful in applications where the shape of the data tends to be dynamic (e.g., web apps and web services).
</blockquote>
</div>
<div class="paragraph">
<p>Part 1 of my book explores in detail the benefits of a flexible data model in real-world applications. Next, let’s explore the cost for adhering to Principle #2.</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="cost-for-principle-2">Cost for Principle #2</h2>
<div class="sectionbody">
<div class="paragraph">
<p>As with any programming principle, using this principle comes with its own set of trade-offs. The price paid for representing data with generic data structures is as follows:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>There is a slight performance hit.</p>
</li>
<li>
<p>There is no data schema built into the data representation.</p>
</li>
<li>
<p>There is no compile-time check that the data is valid.</p>
</li>
<li>
<p>In some statically typed languages, type casting is needed.</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="cost-1-performance-hit">Cost #1: Performance hit</h3>
<div class="paragraph">
<p>When specific classes are used to instantiate data, retrieving the value of a class member is fast because the compiler knows how the data will look and can do many optimizations. With generic data structures, it is harder to optimize, so retrieving the value associated with a key in a map, for example, is a bit slower than retrieving the value of a class member. Similarly, setting the value of an arbitrary key in a map is a bit slower than setting the value of a class member. In most programming languages, this performance hit is not significant, but it is something to keep in mind.</p>
</div>
<div class="quoteblock">
<blockquote>
Retrieving or storing the value associated with an arbitrary key in a map is a bit slower than doing so with a class member.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="cost-2-no-data-schema">Cost #2: No data schema</h3>
<div class="paragraph">
<p>When data is instantiated from a class, the information about the data shape is in the class definition. Every piece of data has an associated data shape. The existence of a data schema at the class level is useful for developers and for IDEs because</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Developers can easily discover the expected data shape.</p>
</li>
<li>
<p>IDEs provide features like field name autocompletion.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>When data is represented with generic data structures, the data schema is not part of the data representation. As a consequence, some pieces of data might have an associated data schema while other pieces of data might not (see Principle #4).</p>
</div>
<div class="quoteblock">
<blockquote>
When generic data structures are used to store data, the data shape is not part of the data representation.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="cost-3-no-compile-time-check-that-the-data-is-valid">Cost #3: No compile-time check that the data is valid</h3>
<div class="paragraph">
<p>Look again at the <code>fullName</code> function in the following listing, which was created to explore Principle #1. This function receives the data it manipulates as an argument.</p>
</div>
<div id="fullname-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function fullName(data) {
return data.firstName + " " + data.lastName;
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>When data that does not conform to the shape <code>fullName</code> expects is passed to it, the problem surfaces only at run time. With generic data structures, mistyping the field that stores the first name (e.g., <code>fistName</code> instead of <code>firstName</code>) results in neither a compile-time error nor an exception. Rather, the string <code>undefined</code> silently takes the place of the first name in the result. The following listing shows this unexpected behavior.</p>
</div>
<div id="weird-behavior-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>fullName({fistName: "Isaac", lastName: "Asimov"});
// → "undefined Asimov"</code></pre>
</div>
</div>
<div class="paragraph">
<p>When we instantiate data via classes with a rigid data shape in a statically typed language, this type of error is caught at compile time. This drawback is mitigated by the application of Principle #4, which deals with data validation.</p>
</div>
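<div class="paragraph">
<p>As a rough illustration of catching such a mistake at run time, the sketch below checks a piece of data against a list of required fields. The <code>authorSchema</code> shape is loosely modelled on the <code>required</code> keyword of JSON Schema, and <code>missingFields</code> is a hypothetical helper. Neither is the validation approach described under Principle #4, which relies on a full schema and a validation library.</p>
</div>

```javascript
// Hypothetical minimal schema, kept separate from the data itself.
var authorSchema = {required: ["firstName", "lastName"]};

// Hypothetical helper: returns the required fields absent from the data.
function missingFields(schema, data) {
  return schema.required.filter(function (key) {
    return !(key in data);
  });
}

missingFields(authorSchema, {fistName: "Isaac", lastName: "Asimov"});
// → ["firstName"]
missingFields(authorSchema, {firstName: "Isaac", lastName: "Asimov"});
// → []
```

<div class="paragraph">
<p>Calling such a check at the boundaries of the system turns the silent <code>undefined</code> into an explicit report of what is missing.</p>
</div>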
<div class="quoteblock">
<blockquote>
When data is represented with generic data structures, data shape errors are caught only at run time.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="cost-4-the-need-for-explicit-type-casting">Cost #4: The need for explicit type casting</h3>
<div class="paragraph">
<p>In some statically typed languages, explicit type casting is needed. This section takes a look at explicit type casting in Java and at dynamic fields in C#.</p>
</div>
<div class="paragraph">
<p>In a statically typed language like Java, author data can be represented as a map whose keys are of type <code>String</code> and whose values are of type <code>Object</code>. For example, in Java, author data is represented by a <code>Map&lt;String, Object&gt;</code> as the following listing illustrates.</p>
</div>
<div id="author-data-java" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="java"><span></span><span class="tok-kd">var</span> <span class="tok-n">asimov</span> <span class="tok-o">=</span> <span class="tok-k">new</span> <span class="tok-n">HashMap</span><span class="tok-o">&lt;</span><span class="tok-n">String</span><span class="tok-p">,</span> <span class="tok-n">Object</span><span class="tok-o">&gt;</span><span class="tok-p">();</span>
<span class="tok-n">asimov</span><span class="tok-p">.</span><span class="tok-na">put</span><span class="tok-p">(</span><span class="tok-s">"firstName"</span><span class="tok-p">,</span> <span class="tok-s">"Isaac"</span><span class="tok-p">);</span>
<span class="tok-n">asimov</span><span class="tok-p">.</span><span class="tok-na">put</span><span class="tok-p">(</span><span class="tok-s">"lastName"</span><span class="tok-p">,</span> <span class="tok-s">"Asimov"</span><span class="tok-p">);</span>
<span class="tok-n">asimov</span><span class="tok-p">.</span><span class="tok-na">put</span><span class="tok-p">(</span><span class="tok-s">"books"</span><span class="tok-p">,</span> <span class="tok-mi">500</span><span class="tok-p">);</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Because the information about the exact type of the field values is not available at compile time, when accessing a field, an explicit type cast is required. For instance, in order to check whether an author is prolific, the value of the <code>books</code> field must be type cast to an integer as this listing shows.</p>
</div>
<div id="is-prolific-java" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="java"><span></span><span class="tok-kd">class</span> <span class="tok-nc">AuthorRating</span> <span class="tok-p">{</span>
<span class="tok-kd">static</span> <span class="tok-kt">boolean</span> <span class="tok-nf">isProlific</span> <span class="tok-p">(</span><span class="tok-n">Map</span><span class="tok-o">&lt;</span><span class="tok-n">String</span><span class="tok-p">,</span> <span class="tok-n">Object</span><span class="tok-o">&gt;</span> <span class="tok-n">data</span><span class="tok-p">)</span> <span class="tok-p">{</span>
<span class="tok-k">return</span> <span class="tok-p">(</span><span class="tok-kt">int</span><span class="tok-p">)</span><span class="tok-n">data</span><span class="tok-p">.</span><span class="tok-na">get</span><span class="tok-p">(</span><span class="tok-s">"books"</span><span class="tok-p">)</span> <span class="tok-o">&gt;</span> <span class="tok-mi">100</span><span class="tok-p">;</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Some Java JSON serialization libraries like Gson (<a href="https://github.com/google/gson" class="bare">https://github.com/google/gson</a>) support serialization of maps of type <code>Map&lt;String, Object&gt;</code>, without requiring the user to do any type casting. All the magic happens behind the scenes!</p>
</div>
<div class="paragraph">
<p>C# supports a dynamic data type called <code>dynamic</code> (see <a href="http://mng.bz/voqJ" class="bare">http://mng.bz/voqJ</a>), which allows type checking to occur at run time. Using this feature, author data is represented as a dictionary, where the keys are of type <code>string</code>, and the values are of type <code>dynamic</code>. The next listing provides this representation.</p>
</div>
<div id="author-data-cs" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="csharp"><span></span><span class="tok-kt">var</span> <span class="tok-n">asimov</span> <span class="tok-p">=</span> <span class="tok-k">new</span> <span class="tok-n">Dictionary</span><span class="tok-p">&lt;</span><span class="tok-kt">string</span><span class="tok-p">,</span> <span class="tok-kt">dynamic</span><span class="tok-p">&gt;();</span>
<span class="tok-n">asimov</span><span class="tok-p">[</span><span class="tok-s">"name"</span><span class="tok-p">]</span> <span class="tok-p">=</span> <span class="tok-s">"Isaac Asimov"</span><span class="tok-p">;</span>
<span class="tok-n">asimov</span><span class="tok-p">[</span><span class="tok-s">"books"</span><span class="tok-p">]</span> <span class="tok-p">=</span> <span class="tok-m">500</span><span class="tok-p">;</span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The information about the exact type of the field values is resolved at run time. When accessing a field, no type cast is required. For instance, when checking whether an author is prolific, the <code>books</code> field can be accessed as though it were declared as an integer as in this listing.</p>
</div>
<div id="is-prolific-cs" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="csharp"><span></span><span class="tok-k">class</span> <span class="tok-nc">AuthorRating</span> <span class="tok-p">{</span>
<span class="tok-k">public</span> <span class="tok-k">static</span> <span class="tok-kt">bool</span> <span class="tok-nf">isProlific</span> <span class="tok-p">(</span><span class="tok-n">Dictionary</span><span class="tok-p">&lt;</span><span class="tok-kt">string</span><span class="tok-p">,</span> <span class="tok-kt">dynamic</span><span class="tok-p">&gt;</span> <span class="tok-n">data</span><span class="tok-p">)</span> <span class="tok-p">{</span>
<span class="tok-k">return</span> <span class="tok-n">data</span><span class="tok-p">[</span><span class="tok-s">"books"</span><span class="tok-p">]</span> <span class="tok-p">&gt;</span> <span class="tok-m">100</span><span class="tok-p">;</span>
<span class="tok-p">}</span>
<span class="tok-p">}</span></code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="summary">Summary</h2>
<div class="sectionbody">
<div class="paragraph">
<p>DOP uses generic data structures to represent data. This might cause a (small) performance hit and impose the need to manually document the shape of data because the compiler cannot validate it statically. Adherence to this principle enables the manipulation of data with a rich set of generic functions (provided by the language and by third-party libraries). Additionally, our data model is flexible. At this point, the data can be either mutable or immutable. The <a href="/databook/2022/06/22/immutable-data.html">next principle</a> illustrates the value of immutability.</p>
</div>
<div class="paragraph">
<p><strong>DOP Principle #2: Represent data with generic data structures</strong></p>
</div>
<div class="paragraph">
<p>To comply with this principle, we represent application data with generic data structures, mostly maps and arrays (or lists). The following figure provides a diagram as a visual representation of this principle.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principle-2-generic-data.png" alt="do principle 2 generic data">
</div>
</div>
<div class="paragraph">
<p>Benefits include</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Using generic functions that are not limited to our specific use case</p>
</li>
<li>
<p>A flexible data model</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The cost for implementing this principle includes</p>
</div>
<div class="ulist">
<ul>
<li>
<p>There is a slight performance hit.</p>
</li>
<li>
<p>There is no data schema built into the data representation.</p>
</li>
<li>
<p>There is no compile-time check that the data is valid.</p>
</li>
<li>
<p>In some statically typed languages, explicit type casting is needed.</p>
</li>
</ul>
</div>
</div>
</div>Yehonathan SharvitWhen adhering to Principle #1 of DOP, code is separated from data. DOP is not opinionated about the programming constructs to use for organizing the code, but it has a lot to say about how the data should be represented. This is the theme of Principle #2.Separate code from data2022-06-22T04:32:24+02:002022-06-22T04:32:24+02:00/databook/2022/06/22/separate-code-from-data<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>The first principle of Data-Oriented Programming (DOP) is a design principle that recommends a clear separation between code (behavior) and data. This may appear to be an FP principle, but in fact, one can adhere to it or break it either in FP or in OOP:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Adherence to this principle in OOP means aggregating the code as methods of a static class.</p>
</li>
<li>
<p>Breaking this principle in FP means hiding state in the lexical scope of a function.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Also, this principle does not relate to the way data is represented. Data representation is addressed by <a href="/databook/2022/06/22/generic-data-structures.html">Principle #2: Represent data with generic data structures</a>.</p>
</div>
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<div class="paragraph">
<p>This article is an exploration of the first principle of Data-Oriented Programming. The other principles of DOP are explored here:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a>: Separating code (behavior) from data.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>: Representing data with generic data structures.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/immutable-data.html">Principle #3</a>: Treating data as immutable.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/data-validation.html">Principle #4</a>: Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div class="quoteblock">
<blockquote>
<em>Principle #1</em> — Separate code from data in a way that the code resides in functions whose behavior does not depend on data that is encapsulated in the function’s context.
</blockquote>
</div>
</div>
</div>
<div class="sect1">
<h2 id="illustration-of-principle-1">Illustration of Principle #1</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Our exploration of Principle #1 begins by illustrating how it can be applied to OOP and FP. The following sections illustrate how this principle can be adhered to or broken in a simple program that deals with:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>An author entity with a <code>firstName</code>, a <code>lastName</code>, and the number of <code>books</code> they wrote.</p>
</li>
<li>
<p>A piece of code that calculates the full name of the author.</p>
</li>
<li>
<p>A piece of code that determines if an author is prolific, based on the number of books they wrote.</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="breaking-principle-1-in-oop">Breaking Principle #1 in OOP</h3>
<div class="paragraph">
<p>Breaking Principle #1 in OOP happens when we write code that combines data and code together in an object. The following listing demonstrates what this looks like.</p>
</div>
<div id="break-1-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class Author {
constructor(firstName, lastName, books) {
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
fullName() {
return this.firstName + " " + this.lastName;
}
isProlific() {
return this.books > 100;
}
}
var obj = new Author("Isaac", "Asimov", 500); // <b class="conum">(1)</b>
obj.fullName();
// → "Isaac Asimov"</code></pre>
</div>
</div>
<div class="colist arabic">
<ol>
<li>
<p>Isaac Asimov really wrote around 500 books!</p>
</li>
</ol>
</div>
</div>
<div class="sect2">
<h3 id="breaking-principle-1-in-fp">Breaking Principle #1 in FP</h3>
<div class="paragraph">
<p>Breaking this principle without classes in FP means hiding data in the lexical scope of a function. The next listing provides an example of this.</p>
</div>
<div id="break-1-fp-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorObject(firstName, lastName, books) {
return {
fullName: function() {
return firstName + " " + lastName;
},
isProlific: function () {
return books > 100;
}
};
}
var obj = createAuthorObject("Isaac", "Asimov", 500);
obj.fullName();
// → "Isaac Asimov"</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="adhering-to-principle-1-in-oop">Adhering to Principle #1 in OOP</h3>
<div class="paragraph">
<p>The following listing shows an example that adheres to Principle #1 in OOP. Compliance with this principle may be achieved even with classes by writing programs such that:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The code consists of static methods.</p>
</li>
<li>
<p>The data is encapsulated in data classes (classes that are merely containers of data).</p>
</li>
</ul>
</div>
<div id="compliant-1-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class AuthorData {
constructor(firstName, lastName, books) {
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
}
class NameCalculation {
static fullName(data) {
return data.firstName + " " + data.lastName;
}
}
class AuthorRating {
static isProlific (data) {
return data.books > 100;
}
}
var data = new AuthorData("Isaac", "Asimov", 500);
NameCalculation.fullName(data);
// → "Isaac Asimov"</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="adhering-to-principle-1-in-fp">Adhering to Principle #1 in FP</h3>
<div class="paragraph">
<p>Here is an example that adheres to Principle #1 in FP. Compliance with this principle means separating code from data.</p>
</div>
<div id="compliant-1-fp-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorData(firstName, lastName, books) {
return {
firstName: firstName,
lastName: lastName,
books: books
};
}
function fullName(data) {
return data.firstName + " " + data.lastName;
}
function isProlific (data) {
return data.books > 100;
}
var data = createAuthorData("Isaac", "Asimov", 500);
fullName(data);
// → "Isaac Asimov"</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="benefits-of-principle-1">Benefits of Principle #1</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Having illustrated how to follow or break Principle #1 both in OOP and FP, let’s look at the benefits that Principle #1 brings to our programs. Careful separation of code from data benefits our programs in the following ways:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Code can be reused in different contexts.</p>
</li>
<li>
<p>Code can be tested in isolation.</p>
</li>
<li>
<p>Systems tend to be less complex.</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="benefit-1-code-can-be-reused-in-different-contexts">Benefit #1: Code can be reused in different contexts</h3>
<div class="paragraph">
<p>Imagine that besides the author entity, there is a user entity that has nothing to do with authors but has two of the same data fields as the author entity: <code>firstName</code> and <code>lastName</code>. The logic of calculating the full name is the same for authors and users: it retrieves the values of two fields with the same names. However, in traditional OOP, as in the <code>Author</code> class in the listing below, the code of <code>fullName</code> cannot be reused on a user in a <em>straightforward</em> way because it is locked inside the <code>Author</code> class.</p>
</div>
<div id="full-name-oo-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class Author {
constructor(firstName, lastName, books) {
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
fullName() {
return this.firstName + " " + this.lastName;
}
isProlific() {
return this.books > 100;
}
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>One way to achieve code re-usability when code and data are mixed is to use OOP mechanisms like inheritance or composition to let the <code>User</code> and <code>Author</code> classes use the same <code>fullName</code> method. These techniques are adequate for simple use cases, but in real-world systems, the abundance of classes (either base classes or composite classes) tends to increase complexity.</p>
</div>
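<div class="paragraph">
<p>As a rough sketch of the inheritance approach mentioned above, the following listing introduces a hypothetical <code>Person</code> base class (it is not part of the original example) that holds <code>fullName</code>, so that <code>Author</code> and <code>User</code> can share it. It is only meant to illustrate the mechanism: sharing a single method already requires a third class.</p>
</div>

```javascript
// Hypothetical base class, introduced only for this illustration.
class Person {
  constructor(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }
  fullName() {
    return this.firstName + " " + this.lastName;
  }
}
// Author inherits fullName and adds its own data and behavior.
class Author extends Person {
  constructor(firstName, lastName, books) {
    super(firstName, lastName);
    this.books = books;
  }
  isProlific() {
    return this.books > 100;
  }
}
// User inherits fullName as well.
class User extends Person {
  constructor(firstName, lastName, email) {
    super(firstName, lastName);
    this.email = email;
  }
}
var user = new User("John", "Doe", "john@doe.com");
user.fullName();
// → "John Doe"
```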
<div class="paragraph">
<p>Here is a simple way to avoid inheritance. In this listing, we duplicate the code of <code>fullName</code> inside a <code>createUserObject</code> function.</p>
</div>
<div id="oo-duplicate-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorObject(firstName, lastName, books) {
var data = {firstName: firstName, lastName: lastName, books: books};
return {
fullName: function fullName() {
return data.firstName + " " + data.lastName;
}
};
}
function createUserObject(firstName, lastName, email) {
var data = {firstName: firstName, lastName: lastName, email: email};
return {
fullName: function fullName() {
return data.firstName + " " + data.lastName;
}
};
}
var obj = createUserObject("John", "Doe", "john@doe.com");
obj.fullName();
// → "John Doe"</code></pre>
</div>
</div>
<div class="paragraph">
<p>In DOP, no modification to the code that deals with author entities is necessary in order to make it available to user entities, because:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The code that deals with full name calculation is separate from the code that deals with the creation of author data.</p>
</li>
<li>
<p>The function that calculates the full name works with any hash map that has a <code>firstName</code> and a <code>lastName</code> field.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>It is possible to leverage the fact that data relevant to the full name calculation for a user and an author has the same shape. With no modifications, the <code>fullName</code> function works properly both on author data and on user data as the following listing shows.</p>
</div>
<div id="same-code-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>function createAuthorData(firstName, lastName, books) {
return {firstName: firstName, lastName: lastName, books: books};
}
function fullName(data) {
return data.firstName + " " + data.lastName;
}
function createUserData(firstName, lastName, email) {
return {firstName: firstName, lastName: lastName, email: email};
}
var authorData = createAuthorData("Isaac", "Asimov", 500);
fullName(authorData);
var userData = createUserData("John", "Doe", "john@doe.com");
fullName(userData);
// → "John Doe"</code></pre>
</div>
</div>
<div class="paragraph">
<p>When Principle #1 is applied in OOP, code reuse is straightforward even when classes are used. In statically typed OOP languages like Java or C# we would have to create a common interface for <code>AuthorData</code> and <code>UserData</code>. In a dynamically typed language like JavaScript, however, that is not required. The code of <code>NameCalculation.fullName()</code> works both with author data and user data as the next listing demonstrates.</p>
</div>
<div id="same-code-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>class AuthorData {
constructor(firstName, lastName, books) {
this.firstName = firstName;
this.lastName = lastName;
this.books = books;
}
}
class NameCalculation {
static fullName(data) {
return data.firstName + " " + data.lastName;
}
}
class UserData {
constructor(firstName, lastName, email) {
this.firstName = firstName;
this.lastName = lastName;
this.email = email;
}
}
var userData = new UserData("John", "Doe", "john@doe.com");
NameCalculation.fullName(userData);
var authorData = new AuthorData("Isaac", "Asimov", 500);
NameCalculation.fullName(authorData);
// → "Isaac Asimov"</code></pre>
</div>
</div>
<div class="quoteblock">
<blockquote>
When code is separate from data, it is straightforward to reuse code in different contexts. This benefit is achievable both in FP and in OOP.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="benefit-2-code-can-be-tested-in-isolation">Benefit #2: Code can be tested in isolation</h3>
<div class="paragraph">
<p>A similar benefit is the ability to test code in an isolated context. When code is not separate from data, it is necessary to instantiate an object to test its methods. For instance, in order to test the <code>fullName</code> code that lives inside the <code>createAuthorObject</code> function, we need to instantiate an author object as this listing shows.</p>
</div>
<div id="test-instantiate-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var author = createAuthorObject("Isaac", "Asimov", 500);
author.fullName() === "Isaac Asimov"
// → true</code></pre>
</div>
</div>
<div class="paragraph">
<p>In this simple scenario, it is not overly burdensome. We only load (unnecessarily) the code for <code>isProlific</code>.
In a real-world situation, however, instantiating an object might involve complex and tedious setup.</p>
</div>
<div class="paragraph">
<p>In the DOP version, where <code>createAuthorData</code> and <code>fullName</code> are separate, we can create the data to be passed to <code>fullName</code> in isolation, testing <code>fullName</code> in isolation as well. The following listing provides an example.</p>
</div>
<div id="test-isolate-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var author = {
firstName: "Isaac",
lastName: "Asimov"
};
fullName(author) === "Isaac Asimov"
// → true</code></pre>
</div>
</div>
<div class="paragraph">
<p>If classes are used, it is only necessary to instantiate a data object. We do not need to load the code for <code>isProlific</code>, which lives in a different class from <code>fullName</code>, in order to test <code>fullName</code>. The next listing lays out an example of this approach.</p>
</div>
<div id="test-isolate-oop-klipse-js" class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="klipse-javascript"><span></span>var data = new AuthorData("Isaac", "Asimov");
NameCalculation.fullName(data) === "Isaac Asimov"
// → true</code></pre>
</div>
</div>
<div class="quoteblock">
<blockquote>
Writing tests is easier when code is separated from data.
</blockquote>
</div>
</div>
<div class="sect2">
<h3 id="benefit-3-systems-tend-to-be-less-complex">Benefit #3: Systems tend to be less complex</h3>
<div class="paragraph">
<p>The third benefit of applying Principle #1 to our programs is that systems tend to be less complex. This benefit is the deepest one but also the one that is most subtle to explain.</p>
</div>
<div class="paragraph">
<p>The type of complexity I refer to is the one that makes systems hard to understand as defined in the paper, “Out of the Tar Pit” by Ben Moseley and Peter Marks (<a href="http://mng.bz/enzq" class="bare">http://mng.bz/enzq</a>). It has nothing to do with the complexity of the resources consumed by a program. Similarly, references to <em>simplicity</em> mean <em>not complex</em> (in other words, easy to understand).</p>
</div>
<div class="quoteblock">
<blockquote>
Complex in the context of this article means <em>hard to understand</em>.
</blockquote>
</div>
<div class="paragraph">
<p>Keep in mind that complexity and simplicity (like hard and easy) are not absolute but relative concepts. The complexity of two systems can be compared to determine whether system A is more complex (or simpler) than system B. When code and data are kept separate, the system tends to be easier to understand for two reasons:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><em>The scope of a data entity or a code entity is smaller than the scope of an entity that combines code and data.</em> Each entity is therefore easier to understand.</p>
</li>
<li>
<p><em>Entities of the system are split into disjoint groups: code and data.</em> Entities therefore have fewer relations to other entities.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>This insight is illustrated in a class diagram of our fictitious Library Management System, where code and data are mixed. It is not necessary to know the details of the classes of this system to see that the following diagram represents a complex system, in the sense that it is hard to understand. The system is hard to understand because there are many dependencies between the entities that compose the system.</p>
</div>
<div id="lib-mgmt-class-diagram-overview-2" class="imageblock">
<div class="content">
<img src="/uml/chapter00/complex-class-relation.png" alt="complex class relation">
</div>
</div>
<div class="paragraph">
<p>The most complex entity of the system is the <code>Librarian</code> entity, which is connected via six relations to other entities. Some relations are data relations (association and composition), and some relations are code relations (inheritance and dependency). But in this design, the <code>Librarian</code> entity mixes code and data, and therefore, it has to be involved in both data and code relations. If each entity of the system is split into a code entity and a data entity <em>without making any further modification to the system</em>, the result is made of two disconnected parts:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The left part is made only of data entities and data relations: association and composition.</p>
</li>
<li>
<p>The right part is made only of code entities and code relations: dependency and inheritance.</p>
</li>
</ul>
</div>
<div id="lib-mgmt-simplified-class-diagram-2" class="imageblock">
<div class="content">
<img src="/uml/chapter00/data-code-relation.png" alt="data code relation">
</div>
</div>
<div class="paragraph">
<p>The new system, where code and data are separate, is easier to understand than the original system, where code and data are mixed, because the data part of the system and the code part of the system can each be understood on its own.</p>
</div>
<div class="quoteblock">
<blockquote>
A system made of disconnected parts is less complex than a system made of a single part.
</blockquote>
</div>
<div class="paragraph">
<p>One could argue that the complexity of the original system, where code and data are mixed, is due to a bad design and that an experienced OOP developer would have designed a simpler system using smart design patterns. That is true, but in a sense, it is irrelevant. The point of Principle #1 is that a system made of entities that do not combine code and data tends to be simpler than a system made of entities that do combine code and data.</p>
</div>
<div class="paragraph">
<p>It has been said many times that <em>simplicity is hard</em>. According to the first principle of DOP, simplicity is easier to achieve when separating code and data.</p>
</div>
<div class="quoteblock">
<blockquote>
Simplicity is easier to achieve when code is separated from data.
</blockquote>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="cost-for-principle-1">Cost for Principle #1</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This section looks at the cost involved when we implement Principle #1. The price we pay in order to benefit from the separation between code and data is threefold:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>There is no control on what code can access what data.</p>
</li>
<li>
<p>There is no packaging.</p>
</li>
<li>
<p>Our systems are made from more entities.</p>
</li>
</ul>
</div>
<div class="sect2">
<h3 id="cost-1-there-is-no-control-on-what-code-can-access-what-data">Cost #1: There is no control on what code can access what data</h3>
<div class="paragraph">
<p>When code and data are mixed, it is easy to understand what pieces of code can access what kinds of data. For example, in OOP, the data is encapsulated in an object, which guarantees that the data is accessible only by the object’s methods. In DOP, data stands on its own. It is transparent if you like, and as a consequence, it can be accessed by any piece of code.</p>
</div>
<div class="paragraph">
<p>When refactoring the shape of some data, <em>every</em> place in our code that accesses this kind of data must be known. Moreover, without the application of <a href="/databook/2022/06/22/immutable-data.html">Principle #3: Immutable data</a>, accessing data by any piece of code is inherently unsafe. In that case, it would be hard to guarantee the validity of our data.</p>
</div>
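<div class="paragraph">
<p>To make this risk concrete, here is a minimal sketch, assuming plain JavaScript objects and the built-in <code>Object.freeze</code> (which freezes only the top level of an object). The <code>sloppyRename</code> function is hypothetical and stands for any piece of code that happens to mutate the data it receives.</p>
</div>

```javascript
function fullName(data) {
  return data.firstName + " " + data.lastName;
}
// Hypothetical function, somewhere else in the code base, that mutates its argument.
function sloppyRename(data) {
  data.lastName = "Unknown";
  return data;
}
var author = {firstName: "Isaac", lastName: "Asimov", books: 500};
sloppyRename(author);
fullName(author);
// → "Isaac Unknown"

// With frozen data, the same mutation has no effect on the data.
var frozenAuthor = Object.freeze({firstName: "Isaac", lastName: "Asimov", books: 500});
try {
  sloppyRename(frozenAuthor); // throws a TypeError in strict mode, is silently ignored otherwise
} catch (e) {
  // Nothing to do: the data was not modified.
}
fullName(frozenAuthor);
// → "Isaac Asimov"
```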
</div>
<div class="sect2">
<h3 id="cost-2-there-is-no-packaging">Cost #2: There is no packaging</h3>
<div class="paragraph">
<p>One of the benefits of mixing code and data is that when you have an object in hand, it is a package that contains both the code (via methods) and the data (via members). As a consequence, it is easy to discover how to manipulate the data: you look at the methods of the class.</p>
</div>
<div class="paragraph">
<p>In DOP, the code that manipulates the data could be anywhere. For example, <code>createAuthorData</code> might be in one file and <code>fullName</code> in another file. This makes it difficult for developers to discover that the <code>fullName</code> function is available. In some situations, it could lead to wasted time and unnecessary code duplication.</p>
</div>
</div>
<div class="sect2">
<h3 id="cost-3-our-systems-are-made-from-more-entities">Cost #3: Our systems are made from more entities</h3>
<div class="paragraph">
<p>Let’s do simple arithmetic. Imagine a system made of <em>N</em> classes that combine code and data. When you split the system into code entities and data entities, you get a system made of 2<em>N</em> entities. This calculation is not accurate, however, because usually, when you separate code and data, the class hierarchy tends to get simpler as we need less class inheritance and composition. Therefore, the number of classes in the resulting system will probably be somewhere between <em>N</em> and 2<em>N</em>.</p>
</div>
<div class="paragraph">
<p>On one hand, when adhering to Principle #1, the entities of the system are simpler. On the other hand, there are more entities. This cost is mitigated by <a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>, which guides us to represent our data with generic data structures.</p>
</div>
<div class="quoteblock">
<blockquote>
When adhering to Principle #1, systems are made of simpler entities, but there are more of them.
</blockquote>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="summary-of-principle-1">Summary of Principle #1</h2>
<div class="sectionbody">
<div class="paragraph">
<p>DOP requires the separation of code from data. In OOP languages, aggregate code in static methods and data in classes with no methods. In FP languages, avoid hiding data in the lexical scope of functions.</p>
</div>
<div class="paragraph">
<p>Separating code from data comes at a price. It reduces control over what pieces of code access our data and can cause our systems to be made of more entities. But it’s worth paying the price because, when adhering to this principle, our code can be reused in different contexts in a straightforward way and tested in isolation. Moreover, a system made of separate entities for code and data tends to be easier to understand.</p>
</div>
<div class="paragraph">
<p>To follow this principle, we separate code from data in such a way that the code resides in functions whose behavior does not depend on data that is encapsulated in the function’s context.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principle-1-separate.png" alt="do principle 1 separate">
</div>
</div>
<div class="paragraph">
<p>Benefits include</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Code can be reused in different contexts.</p>
</li>
<li>
<p>Code can be tested in isolation.</p>
</li>
<li>
<p>Systems tend to be less complex.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The cost for implementing Principle #1 includes</p>
</div>
<div class="ulist">
<ul>
<li>
<p>No control on what code accesses which data.</p>
</li>
<li>
<p>No packaging.</p>
</li>
<li>
<p>More entities.</p>
</li>
</ul>
</div>
</div>
</div>
Yehonathan Sharvit
Principles of Data-Oriented Programming
2022-06-22T04:31:24+02:00
/dop/2022/06/22/principles-of-dop
<div id="preamble">
<div class="sectionbody">
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<div class="paragraph">
<p>Now that my book is completed, I have a better understanding of the core principles of Data-Oriented Programming. This article is a rewrite of <a href="/2020/09/29/do-principles.html">my previous article</a> about DOP Principles from September 2020.</p>
</div>
<div class="paragraph">
<p>Data-oriented programming (DOP) is a programming paradigm aimed at simplifying the design and implementation of software systems in which information is at the center, such as frontend or backend web applications and web services. Instead of designing information systems around software constructs that combine code and data (e.g., objects instantiated from classes), DOP encourages the <em>separation</em> of code from data. Moreover, DOP provides guidelines about how to represent and manipulate data.</p>
</div>
<div class="quoteblock">
<blockquote>
In DOP, data is treated as a first-class citizen.
</blockquote>
</div>
<div class="paragraph">
<p>The essence of DOP is that it treats data as a first-class citizen. It gives developers the ability to manipulate data inside a program with the same simplicity as they manipulate numbers or strings. Treating data as a first-class citizen is made possible by adhering to four core principles:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/databook/2022/06/22/separate-code-from-data.html">Principle #1</a>: Separating code (behavior) from data.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/generic-data-structures.html">Principle #2</a>: Representing data with generic data structures.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/immutable-data.html">Principle #3</a>: Treating data as immutable.</p>
</li>
<li>
<p><a href="/databook/2022/06/22/data-validation.html">Principle #4</a>: Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>When these four principles are combined, they form a cohesive whole. Systems built using DOP are simpler and easier to understand, so the developer experience is significantly improved.</p>
</div>
<div id="combined-together" class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principles-mind-map.png" alt="do principles mind map">
</div>
</div>
<div class="quoteblock">
<blockquote>
In a data-oriented system, code is separated from data. Data is represented with generic data structures that are immutable and have a separate schema.
</blockquote>
</div>
<div class="paragraph">
<p>Notice that DOP principles are language-agnostic. They can be adhered to (or broken) in</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Object-oriented programming (OOP) languages such as Java, C#, C++, etc.</p>
</li>
<li>
<p>Functional programming (FP) languages such as Clojure, OCaml, Haskell, etc.</p>
</li>
<li>
<p>Languages that support both OOP and FP such as JavaScript, Python, Ruby, Scala, etc.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>This series of articles succinctly illustrates, via simple code snippets, how those principles can be applied or broken in JavaScript. It also briefly mentions the benefits of adhering to each principle and the costs paid to enjoy those benefits. Throughout my book <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7">Data-Oriented Programming</a>, the application of DOP principles to production information systems is explored in depth.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="principle-1-separate-code-from-data">Principle #1: Separate code from data</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Principle #1 is a design principle that recommends a clear separation between code (behavior) and data. This may appear to be an FP principle, but in fact, one can adhere to it or break it either in FP or in OOP:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Adherence to this principle in OOP means aggregating the code as methods of a static class.</p>
</li>
<li>
<p>Breaking this principle in FP means hiding state in the lexical scope of a function.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Also, this principle does not relate to the way data is represented. Data representation is addressed by Principle #2.</p>
</div>
<div class="paragraph">
<p>This principle is explored further in <a href="/databook/2022/06/22/separate-code-from-data.html">Separate code from data</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="principle-2-represent-data-with-generic-data-structures">Principle #2: Represent data with generic data structures</h2>
<div class="sectionbody">
<div class="paragraph">
<p>When adhering to Principle #1, code is separated from data. DOP is not opinionated about the programming constructs to use for organizing the code, but it has a lot to say about how the data should be represented. This is the theme of Principle #2.</p>
</div>
<div class="paragraph">
<p>The most common generic data structures are maps (a.k.a. dictionaries) and arrays (or lists). But other generic data structures (e.g., sets, trees, and queues) can be used as well. Principle #2 does not deal with the mutability or the immutability of the data. That is the theme of Principle #3.</p>
</div>
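<div class="paragraph">
<p>As a small sketch of what this could look like in JavaScript, assuming that plain object literals play the role of maps and that arrays play the role of lists, author data might be represented as follows. The <code>bookTitles</code> field is illustrative only.</p>
</div>

```javascript
// Author data as a generic map; the bookTitles field is illustrative.
var author = {
  firstName: "Isaac",
  lastName: "Asimov",
  bookTitles: ["Foundation", "I, Robot"]
};
// Generic functions work on any map or array, whatever it represents.
Object.keys(author);
// → ["firstName", "lastName", "bookTitles"]
author.bookTitles.length;
// → 2
```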
<div class="paragraph">
<p>This principle is explored further in <a href="/databook/2022/06/22/generic-data-structures.html">Represent data with generic data structures</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="principle-3-data-is-immutable">Principle #3: Data is immutable</h2>
<div class="sectionbody">
<div class="paragraph">
<p>With data separated from code and represented with generic data structures, how are changes to the data managed? DOP is very strict on this question. Mutation of data is not allowed! In DOP, changes to data are accomplished by creating new versions of the data. The <em>reference</em> to a variable may be changed so that it refers to a new version of the data, but the <em>value</em> of the data itself must never change.</p>
</div>
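<div class="paragraph">
<p>The following listing is a minimal sketch of this idea, assuming plain JavaScript objects frozen with the built-in <code>Object.freeze</code> (which is shallow) and copied with <code>Object.assign</code>. Libraries that provide persistent data structures could be used instead; this is only one possible way to illustrate the principle.</p>
</div>

```javascript
var author = Object.freeze({firstName: "Isaac", lastName: "Asimov", books: 500});
// Create a new version of the data instead of mutating the existing one.
var updatedAuthor = Object.freeze(Object.assign({}, author, {books: 501}));
author.books;
// → 500
updatedAuthor.books;
// → 501
// A reference may be changed to point to the new version,
// but the value it previously referred to stays the same.
var current = author;
current = updatedAuthor;
current.books;
// → 501
```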
<div class="paragraph">
<p>This principle is explored further in <a href="/databook/2022/06/22/immutable-data.html">Data is immutable</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="principle-4-separate-data-schema-from-data-representation">Principle #4: Separate data schema from data representation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>With data separated from code and represented with generic and immutable data structures, the next question is how to express the shape of the data. In DOP, the expected shape is expressed as a data schema that is kept separate from the data itself. The main benefit of Principle #4 is that it allows developers to decide which pieces of data should have a schema and which pieces of data should not.</p>
</div>
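<div class="paragraph">
<p>Here is a rough sketch of the idea, assuming the schema is written as plain data in the style of JSON Schema. The <code>isValid</code> function is a hypothetical, deliberately tiny validator that only understands the two keywords used here; a real project would more likely rely on a JSON Schema library such as Ajv.</p>
</div>

```javascript
// The schema is plain data, kept apart from the data it describes.
var authorSchema = {
  type: "object",
  required: ["firstName", "lastName"],
  properties: {
    firstName: {type: "string"},
    lastName: {type: "string"},
    books: {type: "number"}
  }
};
// Hypothetical minimal validator: it only checks the "required" keyword
// and the "type" of each declared property that is present in the data.
function isValid(schema, data) {
  var hasRequired = schema.required.every(function (key) {
    return Object.prototype.hasOwnProperty.call(data, key);
  });
  var typesMatch = Object.keys(schema.properties).every(function (key) {
    return !(key in data) || typeof data[key] === schema.properties[key].type;
  });
  return hasRequired && typesMatch;
}
isValid(authorSchema, {firstName: "Isaac", lastName: "Asimov", books: 500});
// → true
isValid(authorSchema, {firstName: "Isaac", books: "many"});
// → false
```

<div class="paragraph">
<p>Because the schema lives outside the data, a piece of code that does not care about validation can simply use the map without ever calling <code>isValid</code>.</p>
</div>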
<div class="paragraph">
<p>This principle is explored further in <a href="/databook/2022/06/22/data-validation.html">Separate data schema from data representation</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="conclusion">Conclusion</h2>
<div class="sectionbody">
<div class="paragraph">
<p>DOP simplifies the design and implementation of information systems by treating data as a first-class citizen. This is made possible by adhering to four language-agnostic core principles:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Separating code from data.</p>
</li>
<li>
<p>Representing application data with generic data structures.</p>
</li>
<li>
<p>Treating data as immutable.</p>
</li>
<li>
<p>Separating data schema from data representation.</p>
</li>
</ul>
</div>
<div id="core-principles" class="imageblock">
<div class="content">
<img src="/uml/chapter00/do-principles-mind-map.png" alt="do principles mind map">
</div>
</div>
</div>
</div>
Yehonathan Sharvit
Reading the present moment
2022-01-17T00:45:32+01:00
/databook/2022/01/17/reading-the-present-moment
<div class="paragraph">
<p><em>This is an experiment I am doing: introducing a bit of self-referential material into Chapter 13 of "Data-Oriented Programming."</em></p>
</div>
<div class="paragraph">
<p><em>I was inspired by the masterpiece "Gödel, Escher, Bach." I am not sure yet whether it will make it into the official version of the book, though. It depends on your feedback.</em></p>
</div>
<div class="paragraph">
<p><em>Throughout the book, Joe — a senior Clojure developer — reveals the secrets of Data-Oriented Programming to Theo and Dave — two fellow developers — who get quite excited about this new paradigm.</em></p>
</div>
<div class="paragraph">
<p><em>In Chapter 13, Dave tests a piece of code he wrote, using as an example the book "Data-Oriented Programming" written by yours truly.</em></p>
</div>
<div class="listingblock">
<div class="content">
<pre class="pygments highlight"><code data-lang="javascript"><span></span><span class="tok-kd">var</span> <span class="tok-nx">yehonathan</span> <span class="tok-o">=</span> <span class="tok-p">{</span>
<span class="tok-s2">"name"</span><span class="tok-o">:</span> <span class="tok-s2">"Yehonathan Sharvit"</span><span class="tok-p">,</span>
<span class="tok-s2">"bookIsbns"</span><span class="tok-o">:</span> <span class="tok-p">[</span><span class="tok-s2">"9781617298578"</span><span class="tok-p">]</span>
<span class="tok-p">};</span>
<span class="tok-nx">Author</span><span class="tok-p">.</span><span class="tok-nx">myName</span><span class="tok-p">(</span><span class="tok-nx">yehonathan</span><span class="tok-p">,</span> <span class="tok-s2">"html"</span><span class="tok-p">);</span>
<span class="tok-c1">// → "&lt;i&gt;Yehonathan Sharvit&lt;/i&gt;"</span></code></pre>
</div>
</div>
<div class="paragraph">
<p><em>And that’s how the self-referential fun begins…​</em></p>
</div>
<div class="paragraph">
<p><em>Please read this article on a device with a wide screen, like a desktop or a tablet. I don’t think it renders well on a mobile phone.</em></p>
</div>
<div class="paragraph">
<p>When Theo comes to Dave’s desk to review his implementation of the "list of authors" feature, he asks him about the author that appears in the test of <code>Author.myName</code>.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Who is Yehonathan Sharvit?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I don’t really know. The name appeared when I googled for "Data-Oriented Programming" yesterday. He wrote a book on the topic. I thought it would be cool to use its ISBN in my test.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Does his book present DOP in a similar way to what Joe taught us?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I don’t know. I guess I’ll find out when I receive the print book I ordered.</p>
</div>
<div class="paragraph">
<p>A few days later, Dave walks to Theo’s cube holding a package. Dave opens the package and they take a look at the cover together.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Wow, that’s-- that’s…​ odd. The woman on the cover - she’s so familiar. I could swear she’s the girl my grandparents knew from this Greek island called Santorini. My grandparents were born there, speak often of their childhood friend and have a photo of her. But how could a girl from their little island wind up on the cover of this book?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: That’s so cool!</p>
</div>
<div class="paragraph">
<p>Dave opens the book with Theo looking over his shoulder. They scan the table of contents.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: It looks like this book covers all the same topics Joe taught us.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: This is great!</p>
</div>
<div class="paragraph">
<p>Dave leafs through a few random sections. His attention is caught by a bit of dialog.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Theo, this is so strange!</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: What?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: The characters in Sharvit’s book have the same names as ours!</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Let me see…​</p>
</div>
<div class="paragraph">
<p>Theo turns to a page from the first chapter. He and Dave read this passage side by side.</p>
</div>
<div class="sidebarblock">
<div class="content">
<div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 1</strong></p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Hey Dave! How’s it going?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Today? Not great. I’m trying to fix a bug in my code! I can’t understand why the state of my objects always changes. I’ll figure it out though, I’m sure. How’s your day going?</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: I just finished the design of a system for a new customer.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Cool! Would it be OK for me to see it? I’m trying to improve my design skills.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Sure! I have the diagram on my desk. We can take a look now if you like.</p>
</div>
</div>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I remember this situation. It was around a year ago, just a few weeks after I had joined Albatross.</p>
</div>
<div class="paragraph">
<p>Theo’s face turns pale.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: I don’t feel well.</p>
</div>
<div class="paragraph">
<p>Theo gets up to splash cold water on his face. When he comes back, still pale, but in better control of his emotions, he tries to remember the situation described in the first chapter of the book.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Was it when I showed you my design for the Klafim prototype?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Exactly! I was quite impressed by your class hierarchy diagrams.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Oh no! Don’t remind me of that time. The long hours of work on such a complex OOP system gave me nightmares.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I remember it as a fun period. Every week I was learning a new technology: GraphQL, Elasticsearch, DataDog, Bigtable, Spring, Express…​</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Luckily, I met Joe a few days later.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Apropos Joe, you never told me exactly how you met him.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Well, now you’ll know everything. The meeting is told quite accurately at the beginning of Chapter 2.</p>
</div>
<div class="paragraph">
<p>Dave reads a few lines at the beginning of Chapter 2.</p>
</div>
<div class="sidebarblock">
<div class="content">
<div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 2</strong></p>
</div>
<div class="paragraph">
<p>The next morning, Theo asks on Hacker News and on Reddit for ways to reduce system complexity and build flexible systems. Some folks mention using different programming languages; others talk about advanced design patterns. Finally, Theo’s attention is captured by a comment from a user named Joe, who mentions "Data-Oriented Programming" and claims that its main goal is to reduce system complexity. Theo has never heard this term before. Out of curiosity, he decides to contact Joe by email.</p>
</div>
<div class="paragraph">
<p>What a coincidence! Joe lives in San Francisco too. Theo invites him to a meeting in his office.</p>
</div>
<div class="paragraph">
<p>Joe is a 40-year-old developer. He’d been a Java developer for nearly a decade before adopting Clojure around 7 years ago.</p>
</div>
<div class="paragraph">
<p>When Theo tells Joe about the Library Management System he designed and built, and about his struggles to adapt to changing requirements, Joe is not surprised.</p>
</div>
</div>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: The book doesn’t say whether it was on Hacker News or on Reddit that you exchanged with Joe.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: I remember it very well: It was on Reddit. In the "r/clojure" community.</p>
</div>
<div class="paragraph">
<p>While they talk, Dave leafs through the pages of the book and comes across a curious passage from Chapter 15…</p>
</div>
<div class="sidebarblock">
<div class="content">
<div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 15</strong></p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I get that. But what happens if the code of the function modifies the data that we are writing? Will we write the original data to the file, or the modified data?</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: I’ll let you think about that while I get a cup of coffee at the <strong>museum</strong> coffee shop. Would you like one?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Yes, an espresso please.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: I have a weird sensation of <em>déjà lu</em>.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Me too.</p>
</div>
</div>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Do you know what <em>déjà lu</em> means?</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: No. But it sounds like it’s related to déjà vu.</p>
</div>
<div class="paragraph">
<p>Dave and Theo sit quietly, pondering the meaning of "déjà lu" and the bigger puzzle of this weird book.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: That’s it! I think I got the hang of it.</p>
</div>
<div class="paragraph">
<p>Dave shows Theo the result from Google Translate with the "Detect language" option activated.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: In French, "déjà lu" means "already read".</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Do you think that the author is French?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Probably. That would explain some odd turns of phrases I’ve noticed here and there in the way the characters express themselves.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: But of course! At least we have found a point on which we are not identical to the characters in this book.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Anyway, a <em>déjà lu</em> must be when you live through a situation that you have already read about in a book.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: But I don’t think we’ve ever been together at a museum!</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Me neither. Could this book be telling not only the past but also the future?</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: A future that we will already know when it happens, since we are reading it now.</p>
</div>
<div class="paragraph">
<p>Dave and Theo together:</p>
</div>
<div class="paragraph">
<p> — A déjà lu!</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: This book tells our past and our future. I wonder if it also tells our present.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: What chapter do you think we would be at the moment?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.</p>
</div>
<div class="paragraph">
<p>Dave slowly turns the pages of the book, until he finds the line that tells the present moment.</p>
</div>
<div class="sidebarblock">
<div class="content">
<div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 13</strong></p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: This book tells our past and our future. I wonder if it also tells our present.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: What chapter do you think we would be at the moment?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.</p>
</div>
<div class="paragraph">
<p>Dave slowly turns the pages of the book, until he finds the line that tells the present moment.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 100%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 13</strong></p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: This book tells our past and our future. I wonder if it also tells our present.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: What chapter do you think we would be at the moment?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.</p>
</div>
<div class="paragraph">
<p>Dave slowly turns the pages of the book, until he finds the line that tells the present moment.</p>
</div>
<table class="tableblock frame-none grid-all stretch">
<colgroup>
<col style="width: 9.0909%;">
<col style="width: 90.9091%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div class="content"><div class="paragraph">
<p><strong>Data-Oriented Programming: Chapter 13</strong></p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: This book tells our past and our future. I wonder if it also tells our present.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: What chapter do you think we would be at the moment?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Let’s see. At the end of Chapter 12, there’s a beautiful drawing of the JSON schema cheatsheet we made together last week. It means, that we should now be in Chapter 13.</p>
</div>
<div class="paragraph">
<p>Dave slowly turns the pages of the book, until he finds the line that tells the present moment.</p>
</div></div></td>
</tr>
</tbody>
</table></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: Dave! This is freaking me out! I think we should close this book immediately and forget all about it.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I can’t. I’m too curious to discover my future.</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: You’ll have to do it without me. Joe told us many times we should never mess with the state of a system.</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: Wait! It’s true that Joe taught us the merits of immutability. But that only concerns the past state of a system. He never said we didn’t have the right to mutate our future!</p>
</div>
<div class="paragraph">
<p><strong>THEO</strong>: You mean that reading beyond Chapter 13 won’t necessarily lock us in a predefined scenario?</p>
</div>
<div class="paragraph">
<p><strong>DAVE</strong>: I hope so!</p>
</div>
<div class="paragraph">
<p>Hoping to stay in control of their destiny, Theo and Dave start reading Chapter 14 of "Data-Oriented Programming".</p>
</div>
<div class="paragraph">
<p><em>Please share your thoughts about this self-referential stuff by replying to this <a href="https://twitter.com/viebel/status/1482899756791836674">tweet</a>.</em></p>
</div>
<div class="paragraph">
<p><em>Did you enjoy this self-referential stuff in Chapter 13?</em></p>
</div>
<div class="paragraph">
<p><em>Do you think it’s a good idea to include this self-referential stuff in the book?</em></p>
</div>
<div class="paragraph">
<p><em>How would you make it better?</em></p>
</div>Yehonathan SharvitThis is an experiment I am doing about introducing a bit of self-referential stuff in Chapter 13 of "Data-Oriented Programming".A hundred things I learned writing my first technical book “Data-Oriented Programming”2021-12-19T05:01:22+01:002021-12-19T05:01:22+01:00/book/2021/12/19/100-things-I-learned-with-data-oriented-programming<ol>
<li>Writing a technical book is much harder than writing blog posts.</li>
<li>Writing a blog post is like running a sprint while writing a book is like running a marathon.</li>
<li>Writing my first technical book without a publisher would have been a MISSION: IMPOSSIBLE!</li>
<li>Each piece of the book content must be clear and interesting. Each part, each chapter, each section, each paragraph, each sentence.</li>
<li>“Clear” is more important than “interesting”. If something is not clear to your reader, it cannot be interesting for them.</li>
<li>A possible way to make things clear is to go from concrete to abstract.</li>
<li>A possible way to make things interesting is to teach the material as a story with fictional characters and a bit of drama.</li>
<li>The “why” is more important than the “what”.</li>
<li>The “what” is more important than the “how”.</li>
<li>An average writer makes the reader think the author is smart. A good writer makes the reader think the reader is smart.</li>
<li>A technical book is written for MQRs (Minimal Qualified Readers).</li>
<li>Figuring out the qualifications of your MQRs (Minimal Qualified Readers) is important as it allows you to assume what knowledge your readers already have.</li>
<li>It’s hard to figure out the qualifications of your MQRs (Minimal Qualified Readers).</li>
<li>Checking book sales could be addictive.</li>
<li>Making a good Table of Contents is crucial as it is the first part of the book potential readers will encounter.</li>
<li>Making a good Table of Contents is hard as you need to figure out what you really want to talk about.</li>
<li>The Table of Contents might evolve a bit as you write your book.</li>
<li>You should resist the temptation to write the first chapter before the Table of Contents is ready.</li>
<li>It’s not necessary to write chapters in order. But it’s easier.</li>
<li>Never assume that your readers will read the next chapter only because they have enjoyed the previous chapter.</li>
<li>You should always convince your readers why what you are teaching is important and relevant for them.</li>
<li>Before writing a chapter, you should formulate for yourself what the main objective of the chapter is.</li>
<li>If a chapter has two main objectives, it’s a sign that you should split it into two chapters.</li>
<li>A chapter should be treated like a piece of software. You should resist the temptation of writing the chapter contents without a plan.</li>
<li>A possible way to make things interesting is to use concrete examples.</li>
<li>A possible way to make things clear inside a chapter is to start with the easy stuff and increase the level of difficulty as the chapter goes on.</li>
<li>Do not hesitate to highlight sentences that convey an important message.</li>
<li>It’s OK to engage in writing a technical book without mastering every topic you want to cover in your book.</li>
<li>Writing a technical book involves a decent amount of research even if you consider yourself an expert in the field.</li>
<li>Finding attractive but accurate titles to book chapters is an art.</li>
<li>You can learn a lot from a failed attempt to write a book, provided that you put your ego aside.</li>
<li>If you try to write a Wikipedia article about the topic of your book before it is mentioned by other sources, it will be rejected.</li>
<li>It’s possible to write a technical book while keeping your day job as a programmer, provided that you are willing to wake up early or sleep late.</li>
<li>Writing a technical book takes between one and two years.</li>
<li>Don’t rush! Enjoy the journey…</li>
<li>It makes a lot of sense to use source control software for your manuscript.</li>
<li>AsciiDoc rocks!</li>
<li>PlantUML rocks!</li>
<li>NeoVim rocks!</li>
<li>Using a tool - like PlantUML - that generates diagrams from text makes it easy to refactor multiple diagrams at once (e.g., rename a label, change a color).</li>
<li>People on Reddit could feel hurt by opinions that take them out of their comfort zone.</li>
<li>On Reddit, when people feel hurt, they could become violent.</li>
<li>Being mentored by an experienced writer is a blessing.</li>
<li>If you are lucky enough to be mentored by an experienced writer, ask them to be hard with you. That’s how you are going to improve your book!</li>
<li>A good technical reviewer is a representative of your MQRs (Minimal Qualified Readers). They can tell you upfront if something is going to be unclear to your readers.</li>
<li>You should make sure your readers will never frown while reading your book.</li>
<li>A project manager that pays attention to the details is important.</li>
<li>Your publisher is your partner.</li>
<li>You could make more dollars per copy by self-publishing but you’d probably sell far fewer copies.</li>
<li>Asking for early feedback from external reviewers is a great source of improvement.</li>
<li>Releasing an early version of the book (approx. when the first third is ready) allows you to find out if the topic of your book is interesting.</li>
<li>Finding a good book title is hard.</li>
<li>Finding a good book subtitle is even harder.</li>
<li>You need to be very careful not to hurt the sensitivity of any of your readers.</li>
<li>Having your book featured on the Hacker News home page does not mean selling lots of copies.</li>
<li>Twitter is a great medium to share ideas from your book.</li>
<li>Writing a book can sometimes put you in a state of flow.</li>
<li>My real motivation for writing a book was neither to be famous nor to be rich. I only wanted to fulfill a childhood dream.</li>
<li>It’s hard to find your voice.</li>
<li>Once you have found your voice, the writing flows much better.</li>
<li>Usually readers stop reading after reading the middle of the book. If you want them to read the second half of your book, you need to find a way to hook them.</li>
<li>A possible way to hook your readers is to tell a story.</li>
<li>Inspiration is not linear. It’s OK to stop writing for a couple of hours.</li>
<li>Motivation is not linear. It’s OK to stop writing for a couple of weeks.</li>
<li>Be open to criticism - even when it hurts your ego.</li>
<li>The more you write, the more you like it.</li>
<li>It’s safe to assume that every developer can read JavaScript.</li>
<li>It’s a great feeling to mention the work of other authors.</li>
<li>You should make sure that each and every code snippet - that appears in your book - runs as expected.</li>
<li>Invoking “it’s so obvious I don’t need to explain it” is not an acceptable argument.</li>
<li>Writing your teaching materials as a dialogue between an imaginary expert and an imaginary novice is a very useful way to figure out what questions your materials might raise in your reader’s mind.</li>
<li>Sometimes the questions that an imaginary novice would ask about the stuff you teach would be tough. Don’t ignore them. It’s an opportunity to make your book better.</li>
<li>Rewriting a chapter from scratch because you forgot to save your work could be a blessing, as writing from scratch might lead to material of higher quality.</li>
<li>Writing in a coffee shop makes me feel like a famous author, but in fact I am much more productive at home.</li>
<li>Writing a preface - after the whole manuscript is ready - is really a pleasure!</li>
<li>You should think about the way your content is going to appear on paper. Use headlines, highlights, callouts and diagrams to make sure it doesn’t look boring.</li>
<li>Resist the temptation to impress your readers with “cool stuff” if you think it might confuse them.</li>
<li>Working on your book is a good reason to wake up early. Sometimes, before sunrise (even in summer!).</li>
<li>Include at least 2 or 3 diagrams in every chapter. It makes the material fun to read and easier to grasp.</li>
<li>Draw your diagrams on a sheet of paper before using drawing software.</li>
<li>It’s OK to use colors in diagrams for the online version of the book. But remember that the print version of the book will not be in color.</li>
<li>Mind maps are a great visualization tool. Use them smartly.</li>
<li>When a section is more difficult to read than the others, let your readers know about it.</li>
<li>When a section is more difficult to read than the others, make it skippable.</li>
<li>It’s OK - from time to time - to copy-paste a diagram in order to save your readers the need to flip back.</li>
<li>Asking a friend or a colleague to read your work in progress is not a productive idea. The best feedback comes from people you don’t know.</li>
<li>Brainstorming with a friend or a colleague about a difficulty you encounter might be a productive idea.</li>
<li>Throwing away some (good) ideas is sometimes necessary. Not easy but necessary.</li>
<li>When you are blocked in the middle of a chapter, it might be a sign that you need to rethink the chapter.</li>
<li>When you are blocked in the middle of a chapter, it might be a sign that you need to rest and come back later.</li>
<li>Adapting parts of your book to blog posts could be a good idea. But you need to resist the temptation of copy-pasting verbatim as the blog posts will be without the context of the book.</li>
<li>It feels great when someone with lots of followers tweets about the fun they had reading your book.</li>
<li>Don’t worry if your English is not perfect. Your manuscript will be proofread later.</li>
<li>“Not being a native English speaker” is not an excuse for your lack of clarity.</li>
<li>Writing an appendix is much easier than writing a chapter.</li>
<li>Using humour in a technical book is possible. Hopefully, it’s well appreciated.</li>
<li>You should write the chapter introduction after all the other parts of the chapter are written.</li>
<li>Getting positive feedback - even from people who are easily enthusiastic - feels good.</li>
<li>Front matter is the last part an author writes.</li>
<li>Writing a hundred things you learned from writing a technical book is not as difficult as it may seem.</li>
</ol>{"image"=>"authorimage.jpg", "greetings"=>"Hi there! My name is Yehonathan Sharvit. I'm a software developer, author and speaker. My passion is to make interesting things easy to understand. I hope you will enjoy the articles."}Writing a technical book is much harder than writing blog posts. Writing a blog post is like running a sprint while writing a book is like running a marathon. Writing my first technical book without a publisher would have been a MISSION: IMPOSSIBLE! Each piece of the book content must be clear and interesting. Each part, each chapter, each section, each paragraph, each sentence. “Clear” is more important that “interesting”. If something is not clear to your reader, it cannot be interesting for them. A possible way to make things clear is to go from concrete to abstract. A possible way to make things interesting is to teach the material as a story with fiction characters and a bit of drama. The “why” is more important than the “what”. The “what” is more important than the “how”. An average writer makes the reader think the author is smart. A good writer makes the reader think the reader is smart. A technical book is written for MQRs (Minimal Qualified Readers). Figuring out the qualifications of your MQRs (Minimal Qualified Readers) is important as it allows you to assume what knowledge your readers already have. It’s hard to figure out the qualifications of your MQRs (Minimal Qualified Readers). Checking book sales could be addictive. Making a good Table of Contents is crucial as it is the first part of the book potential readers will encounter. Making a good Table of Contents is hard as you need to figure out what you really want to talk about. The Table of Contents might evolve a bit as you write your book. You should resist the temptation to write the first chapter before the Table of Contents is ready. It’s not necessary to write chapters in order. But it’s easier. 
Never assume that your readers will read the next chapter only because they have enjoyed the previous chapter. You should always convince your readers why what you are teaching is important and relevant for them. Before writing a chapter, you should formulate to yourself what is the main objective of the chapter. If a chapter has two main objectives, it’s a sign that you should split it into two chapters. A chapter should be treated like a piece of software. You should resist the temptation of writing the chapter contents without a plan. A possible way to make things interesting is to use concrete examples. A possible way to make things clear inside a chapter is to start with the easy stuff and increase the level of difficulty as the chapter goes on. Do not hesitate to highlight sentences that convey an important message. It’s OK to engage in writing a technical book without mastering every topic you want to cover in your book. Writing technical book involves a decent amount of research even if you consider yourself as an expert in the field. Finding attractive but accurate titles to book chapters is an art. You can learn a lot from a failed attempt to write a book, provided that you put your ego aside. If you try to write a Wikipedia article about the topic of your book before it is mentioned by other sources, it will be rejected. It’s possible to write a technical book while keeping your day job as a programmer, provided that you are willing to wake up early or sleep late. Writing a technical book takes between a year and two. Don’t rush! Enjoy the journey… It makes lot of sense to use a source control software for your manuscript. AsciiDoc rocks! PlantUML rocks! NeoVim rocks! Using a tool - like PlantUML - that generates diagrams from text makes it easy to refactor multiple diagrams at once (e.g rename a label, change a color). People on Reddit could feel hurt by opinions that take them out of their comfort zone. 
On Reddit, when people feel hurt, they could become violent. Being mentored by an experienced writer is a blessing. If you are lucky enough to be mentored by an experienced writer, ask them to be hard with you. That’s how you are going to improve your book! A good technical reviewer is a representative of your MQRs (Minimal Qualified Readers). They can tell you upfront is something is going to be unclear to your readers. You should make sure your readers will never frown while reading your book. A project manager that pays attention to the details is important. Your publisher is your partner. You could make more dollars per copy by self-publishing but you’d probably sell much less copies. Asking early feedback from external reviewers is a great source of improvement. Releasing an early version of the book (approx. when the first third is ready) allows you to find out if the topic of your book is interesting. Finding a good book title is hard. Finding a good book subtitle is even harder. You need to be very careful not to hurt the sensitivity of any of your readers. Having your book featured on HackerNews home page does not mean selling lots of copies. Twitter is a great medium to share ideas from your book. Writing a book could sometimes take you to flow. My real motivation for writing a book was neither to be famous nor to be rich. I only wanted to accomplish a child’s dream. It’s hard to find your voice. Once you have found the your voice, the writing flows much better. Usually readers stop reading after reading the middle of the book. If you want them to read the second half of your book, you need to find a way to hook them. A possible way to hook your readers is to tell a story. Inspiration is not linear. It’s OK to stop writing for a couple of hours. Motivation is not linear. It’s OK to stop writing for a couple of weeks. Be open to critics - even when they hurt your ego. The more you write, the more you like it. 
It’s safe to assume that every developer can read JavaScript. It’s a great feeling to mention the work of other authors. You should make sure that each and every code snippet - that appears in your book - runs as expected. Invoking “it’s so obvious I don’t need to explain it” is not an acceptable argument. Writing your teaching materials as a dialogue between an imaginary expert and a imaginary novice is a very useful process in order to figure out what questions your materials might raise in your reader’s mind. Sometimes the questions that an imaginary novice would ask about the stuff you teach would be tough. Don’t ignore them. It’s an opportunity to make your book better. Rewriting a chapter from scratch because you forgot to save your work could be a blessing as writing from scratch might lead to a material of higher quality. Writing in a coffee shop makes me feel like a famous author, but in fact I am much more productive at home. Writing a preface - after the whole manuscript is ready - is really a pleasure! You should think about the way your contents is going to appear on the paper. Use headlines, highlights, call outs and diagrams to make sure it doesn’t look boring. Resist the temptation to impress your readers with “cool stuff” if you think it might confuse them. Working on your book is a good reason to wake up early. Sometimes, before sunrise (even in summer!). Include at least 2 or 3 diagrams in every chapter. It makes the material fun to read and easier to grasp. Draw your diagrams on a sheet of paper before using drawing software. It’s OK to use colors in diagrams for the online version of the book. But remember that the print version of the book will be not be in color. Mind maps are a great visualization tool. Use them smartly. When a section is more difficult to read than the others, let your readers know about it. When a section is more difficult to read than the others, make it skippable. 
It’s OK - from time to time - to copy-paste a diagram in order to save your readers the need to flip back. Asking a friend or a colleague to read your work in progress is not a productive idea. The best feedback comes from people you don’t know. Brainstorming with a friend or a colleague about a difficulty you encounter might be a productive idea. Throwing away some (good) ideas is sometimes necessary. Not easy but necessary. When you are blocked in the middle of a chapter, it might be a sign that you need to rethink the chapter. When you are blocked in the middle of a chapter, it might be a sign that you need to rest and come back later. Adapting parts of your book to blog posts could be a good idea. But you need to resist the temptation of copy-pasting verbatim as the blog posts will be without the context of the book. It feels great when someone with lots of followers tweets about the fun they had reading your book. Don’t worry if your English is not perfect. Your manuscript will be proofread later. “Not being a native English speaker” is not an excuse for your lack of clarity. Writing an appendix is much easier than writing a chapter. Using humour in a technical book is possible. Hopefully, it’s well appreciated. You should write the chapter introduction after all the other parts of the chapter are written. Getting positive feedback - even from people who are easily enthusiastic - feels good. Front matter is the last part an author writes. Writing a hundred things you learned from writing a technical book is not as difficult as it may seem.Data-Oriented Programming: A link in the chain of programming paradigms2021-12-10T05:45:32+01:002021-12-10T05:45:32+01:00/databook/2021/12/10/dop-link<div id="preamble">
<div class="sectionbody">
<div>
<p>
This article is an excerpt from my <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7"> book </a> about <b>Data-Oriented Programming</b>.
</p>
<p>
More excerpts are available on my <a href="/data-oriented-programming-book.html">blog</a>.
</p>
<br>
</div>
<div class="paragraph">
<p>Data-Oriented Programming is not an invention. It has its <strong>origins</strong> in the 1950s and the invention of LISP and is based on a set of <strong>best practices</strong> that can be found in both Functional Programming and Object-Oriented Programming. However, this paradigm has only been applicable in production systems at scale since the 2010s and the implementation of <strong>efficient</strong> persistent data structures.</p>
</div>
<div class="paragraph">
<p>This article traces the major <strong>ideas</strong> and <strong>discoveries</strong> which, over the years, have allowed the emergence of DOP.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="timeline">Timeline</h2>
<div class="sectionbody">
<div class="imageblock">
<div class="content">
<img src="/uml/dop-timeline.png" alt="dop timeline">
</div>
</div>
<div class="sect3">
<h4 id="1958-lisp">1958: LISP</h4>
<div class="paragraph">
<p>In LISP, John McCarthy has the ingenious idea to represent <strong>data</strong> as <strong>generic immutable lists</strong> and to invent a language that makes it very <strong>natural</strong> to create lists and to access any part of a list. That’s the reason why LISP stands for LISt Processing.</p>
</div>
<div class="paragraph">
<p>In a sense, LISP lists are the ancestors of JavaScript object literals. The idea that it makes sense to represent data with generic data structures (DOP Principle #2) definitely comes from LISP.</p>
</div>
<div class="paragraph">
<p>The main limitation of LISP lists is that updating a list requires creating a new version of it by cloning it, which has a negative impact on <strong>performance</strong>, both in terms of CPU and memory.</p>
</div>
</div>
<div class="sect3">
<h4 id="1981-values-and-objects">1981: Values and Objects</h4>
<div class="paragraph">
<p>In a beautiful, short and easy-to-read paper named <a href="https://www.researchgate.net/publication/220177801_Values_and_Objects_in_Programming_Languages">Values and Objects in Programming Languages</a>, Bruce MacLennan clarifies the distinction between <strong>values</strong> and <strong>objects</strong>. In a nutshell:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Values are <strong>timeless</strong> abstractions for which the concepts of <strong>updating</strong>, <strong>sharing</strong> and <strong>instantiation</strong> have no meaning. For instance, numbers are values.</p>
</li>
<li>
<p>Objects exist in <strong>time</strong> and hence can be <strong>created</strong>, <strong>destroyed</strong>, <strong>copied</strong>, <strong>shared</strong> and <strong>updated</strong>. For instance, an employee in a human resource software system is an object.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The meaning of the term <em>object</em> in this paper is not exactly the same as in the context of Object-Oriented Programming.</p>
</div>
<div class="paragraph">
<p>The author explains why it’s much simpler to write code that deals with values than code that deals with objects.</p>
</div>
<div class="paragraph">
<p>This paper has been a source of inspiration for Data-Oriented Programming as it encourages us to implement our systems in such a way that most of our code deals with values.</p>
</div>
</div>
<div class="sect3">
<h4 id="2000-ideal-hash-trees">2000: Ideal Hash Trees</h4>
<div class="paragraph">
<p>Phil Bagwell invented a data structure called Hash Array Mapped Trie (HAMT). In his paper <a href="https://lampwww.epfl.ch/papers/idealhashtrees.pdf">Ideal Hash trees</a>, he used HAMT to implement hash maps with nearly ideal characteristics both in terms of <strong>computation</strong> and <strong>memory</strong> usage.</p>
</div>
<div class="paragraph">
<p>HAMT and Ideal hash trees are the foundation of <strong>efficient persistent data structures</strong>.</p>
</div>
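<p>To give a feel for why persistent data structures can be efficient, here is a minimal sketch of structural sharing through path copying. It uses plain frozen JavaScript objects, not a real HAMT, and <code>setIn</code> is a hypothetical helper introduced only for this illustration.</p>

```javascript
// Sketch only: path copying on plain frozen objects, not a real HAMT.
// `setIn` is a hypothetical helper introduced for this illustration.
function setIn(map, path, value) {
  if (path.length === 0) {
    return value;
  }
  const [key, ...rest] = path;
  const child = (map !== null && typeof map === "object") ? map[key] : undefined;
  // Copy only the node that sits on the path; every other branch is reused as-is.
  return Object.freeze({ ...map, [key]: setIn(child, rest, value) });
}

const library = Object.freeze({
  catalog: Object.freeze({ title: "Watchmen", year: 1986 }),
  members: Object.freeze({ count: 2 })
});

const nextLibrary = setIn(library, ["catalog", "year"], 1987);

console.log(library.catalog.year);     // 1986: the original version is untouched
console.log(nextLibrary.catalog.year); // 1987
console.log(nextLibrary.members === library.members); // true: the untouched branch is shared
```

<p>Only the nodes along the updated path are copied, so the cost of an update depends on the depth of the path rather than on the size of the whole structure.</p>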
</div>
<div class="sect3">
<h4 id="2006-out-of-the-tar-pit">2006: Out of the Tar Pit</h4>
<div class="paragraph">
<p>In <a href="https://www.semanticscholar.org/paper/Out-of-the-Tar-Pit-Moseley-Marks/41dc590506528e9f9d7650c235b718014836a39d">Out of the Tar Pit</a>, Ben Moseley and Peter Marks claim that <strong>complexity</strong> is the single major difficulty in the development of large-scale software systems. In the context of their paper, complexity means what makes a system <strong>hard to understand</strong>.</p>
</div>
<div class="paragraph">
<p>The main insight of the authors is that most of the complexity of software systems is not essential but <strong>accidental</strong>: the complexity doesn’t come from the problem we have to solve but from the software constructs we use to solve the problem. They suggest various ways to <strong>reduce the complexity</strong> of software systems.</p>
</div>
<div class="paragraph">
<p>In a sense, Data-Oriented Programming is a way to get us out of the tar pit.</p>
</div>
</div>
<div class="sect3">
<h4 id="2007-clojure">2007: Clojure</h4>
<div class="paragraph">
<p>Rich Hickey, an <strong>Object-Oriented</strong> Programming expert, invented <strong>Clojure</strong> to make it easier to develop information systems at scale. Rich Hickey likes to summarize Clojure’s core value with the phrase: <strong>"Just use maps!"</strong>. By maps, he means <strong>immutable</strong> maps to be manipulated <strong>efficiently</strong> by <strong>generic</strong> functions. Those maps were implemented using the data structures presented by Phil Bagwell in "Ideal Hash Trees".</p>
</div>
<div class="paragraph">
<p>Clojure has been the main source of inspiration for Data-Oriented Programming. In a sense, Data-Oriented Programming is a formalization of the underlying principles of Clojure and how to apply them in other programming languages.</p>
</div>
</div>
<div class="sect3">
<h4 id="2009-immutability-for-all">2009: Immutability for all</h4>
<div class="paragraph">
<p>Clojure’s <strong>efficient</strong> implementation of <strong>persistent data structures</strong> has been attractive for developers from other programming languages. In 2009, they were ported to Scala. Over the years, they have been <strong>ported</strong> to other <strong>programming languages</strong> either by organizations (like Facebook for Immutable.js) or by individual contributors (like Glen Peterson for Paguro in Java).</p>
</div>
<div class="paragraph">
<p>Nowadays, DOP is applicable in virtually any programming language!</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="dop-principles-as-best-practices">DOP principles as best practices</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The <a href="/2020/09/29/do-principles.html">principles of Data-Oriented programming are not new</a>. They come from <strong>best practices</strong> that are well-known among software developers from various programming languages. The <em>innovation</em> of Data-Oriented programming is the combination of those principles into a cohesive whole.</p>
</div>
<div class="paragraph">
<p>In this section, we put each one of the 4 DOP principles into its broader scope.</p>
</div>
<div class="sect3">
<h4 id="principle-1-separate-code-from-data">Principle #1: Separate code from data</h4>
<div class="paragraph">
<p>Separating code from data used to be the main point of <strong>tension</strong> between <strong>Object-Oriented</strong> Programming (OOP) and <strong>Functional</strong> Programming (FP). Traditionally, in OOP we <strong>encapsulate</strong> data together with code in <strong>stateful</strong> objects, while in FP we write <strong>stateless</strong> functions that receive data they manipulate as an <strong>explicit</strong> argument.</p>
</div>
<div class="paragraph">
<p>This tension has been reduced over the years as it is possible in FP to write stateful functions with data encapsulated in their <a href="https://en.wikipedia.org/wiki/Scope_computer_science">lexical scope</a>. Moreover, OOP languages like Java and C# have added support for <strong>anonymous functions</strong> (lambdas).</p>
</div>
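<p>As an illustration of a stateful function whose data is encapsulated in its lexical scope, here is a small JavaScript sketch. The name <code>makeCounter</code> is hypothetical and was chosen only for this example.</p>

```javascript
// Hypothetical example: a closure that keeps its state in its lexical scope.
function makeCounter() {
  let count = 0; // reachable only through the returned function
  return function increment() {
    count += 1;
    return count;
  };
}

const counterA = makeCounter();
const counterB = makeCounter();

const first = counterA();  // 1
const second = counterA(); // 2
const other = counterB();  // 1: each closure has its own private state

console.log(first, second, other);
```

<p>The variable <code>count</code> plays the role of a private field: it is encapsulated together with the code that manipulates it, without any class or object declaration.</p>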
</div>
<div class="sect3">
<h4 id="principle-2-represent-data-with-generic-data-structures">Principle #2: Represent data with generic data structures</h4>
<div class="paragraph">
<p>One of the main innovations of <strong>JavaScript</strong> when it was released in December 1995 was the <strong>ease</strong> of creating and manipulating hash maps via <strong>object literals</strong>. The increasing <strong>popularity</strong> of JavaScript over the years as a language used everywhere (frontend, backend, desktop) has influenced the developer community to represent data with hash maps when possible. Although it feels more natural in <strong>dynamically-typed</strong> programming languages, it is also applicable in <strong>statically-typed</strong> programming languages.</p>
</div>
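<p>The following sketch illustrates the point with object literals: once data is represented as a hash map, the same generic function works on any kind of record. The data and the <code>pick</code> helper are made up for this illustration.</p>

```javascript
// Made-up data, represented as plain hash maps via object literals.
const book = { title: "Watchmen", year: 1986, available: true };
const member = { name: "Fido", email: "fido@example.com" };

// A generic function: it knows nothing about books or members.
function pick(data, fields) {
  const result = {};
  for (const field of fields) {
    if (field in data) {
      result[field] = data[field];
    }
  }
  return result;
}

console.log(pick(book, ["title", "year"])); // { title: 'Watchmen', year: 1986 }
console.log(pick(member, ["name"]));        // { name: 'Fido' }
```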
</div>
<div class="sect3">
<h4 id="principle-3-data-is-immutable">Principle #3: Data is immutable</h4>
<div class="paragraph">
<p>Data immutability is considered a best practice as it makes the behaviour of our program more <strong>predictable</strong>. For instance, in <a href="https://www.oreilly.com/library/view/effective-java/9780134686097">Effective Java</a>, Joshua Bloch mentions "Minimize mutability" as one of Java’s best practices.</p>
</div>
<div class="paragraph">
<p>There is a famous quote from Alan Kay - who is considered by many as the inventor of Object-Oriented Programming - about the value of immutability:</p>
</div>
<div class="quoteblock">
<blockquote>
<div class="paragraph">
<p>The last thing you wanted any programmer to do is mess with internal state even if presented figuratively. Instead, the objects should be presented as site of higher level behaviors more appropriate for use as dynamic components. (…​) It is unfortunate that much of what is called "object-oriented programming" today is simply old style programming with fancier constructs. Many programs are loaded with "assignment-style" operations now done by more expensive attached procedures.</p>
</div>
</blockquote>
</div>
<div class="paragraph">
<p>Unfortunately, until 2007 and the implementation of efficient persistent data structures in Clojure, immutability was not applicable for production applications at scale.</p>
</div>
<div class="paragraph">
<p>Nowadays, efficient persistent data structures are available in most programming languages.</p>
</div>
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top">Language</th>
<th class="tableblock halign-left valign-top">Library</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">Java</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/GlenKPeterson/Paguro">Paguro</a></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">C#</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://docs.microsoft.com/en-us/archive/msdn-magazine/2017/march/net-framework-immutable-collections">Provided by the language</a></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">JavaScript</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://immutable-js.com/">Immutable.js</a></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">Python</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/tobgu/pyrsistent">Pyrsistent</a></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">Ruby</p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><a href="https://github.com/hamstergem/hamster">Hamster</a></p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>In addition to that, many languages provide support for <strong>read-only</strong> objects natively. Java added <a href="https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/lang/Record.html">record classes</a> in Java 14. C# introduced a <code>record</code> type in C# 9. There is an <a href="https://github.com/tc39/proposal-record-tuple">ECMAScript proposal</a> for supporting immutable records and tuples in JavaScript. Python 3.7 introduced <a href="https://docs.python.org/3/library/dataclasses.html">immutable data classes</a>.</p>
</div>
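<p>In today’s JavaScript, a read-only object can already be approximated with the standard <code>Object.freeze</code> function. The sketch below shows its behaviour in strict mode, and also that freezing is shallow. The helper <code>tryToUpdateYear</code> and the data are made up for this illustration.</p>

```javascript
// Made-up helper that attempts a mutation and reports what happened.
function tryToUpdateYear(book, year) {
  "use strict"; // in strict mode, writing to a frozen object throws a TypeError
  try {
    book.year = year;
    return "updated";
  } catch (error) {
    return error.name;
  }
}

const frozenBook = Object.freeze({ title: "Watchmen", year: 1986 });
console.log(tryToUpdateYear(frozenBook, 1987)); // TypeError
console.log(frozenBook.year);                   // 1986

// Object.freeze is shallow: nested data stays mutable unless it is frozen too.
const shelf = Object.freeze({ books: ["Watchmen"] });
shelf.books.push("Jimmy Corrigan");
console.log(shelf.books.length);                // 2
```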
</div>
<div class="sect3">
<h4 id="principle-4-separate-data-schema-from-data-representation">Principle #4: Separate data schema from data representation</h4>
<div class="paragraph">
<p>One of the more virulent <strong>criticisms</strong> of dynamically-typed programming languages used to be related to the lack of data validation. The answer that dynamically-typed languages used to give to this criticism was that you trade data <strong>safety</strong> for data <strong>flexibility</strong>.</p>
</div>
<div class="paragraph">
<p>Since the development of <strong>data schema</strong> languages like <a href="https://json-schema.org/">JSON schema</a>, it is natural to validate data even when data is represented as hash maps.</p>
</div>
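<p>To make the idea concrete, here is a sketch in which the schema is itself a piece of data, kept separate from the data it describes. The schema uses JSON Schema keywords, but the <code>validate</code> function is a toy written for this illustration: it only understands <code>required</code> and primitive <code>type</code> checks on properties. In a real project, a JSON Schema library would do this job.</p>

```javascript
// The schema is plain data, separate from the data it describes.
const bookSchema = {
  type: "object",
  required: ["title", "year"],
  properties: {
    title: { type: "string" },
    year: { type: "number" }
  }
};

// Toy validator (illustration only): handles `required` and primitive
// `type` checks on properties, nothing else from JSON Schema.
function validate(schema, data) {
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return ["data should be an object"];
  }
  const errors = [];
  for (const field of schema.required || []) {
    if (!(field in data)) {
      errors.push("missing required field: " + field);
    }
  }
  for (const [field, fieldSchema] of Object.entries(schema.properties || {})) {
    if (field in data && typeof data[field] !== fieldSchema.type) {
      errors.push(field + " should be a " + fieldSchema.type);
    }
  }
  return errors;
}

console.log(validate(bookSchema, { title: "Watchmen", year: 1986 }));   // []
console.log(validate(bookSchema, { title: "Watchmen", year: "1986" })); // [ 'year should be a number' ]
console.log(validate(bookSchema, { year: 1986 }));                      // [ 'missing required field: title' ]
```

<p>Because validation is an explicit function call, we are free to validate a map at the boundaries of the system and to skip validation elsewhere.</p>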
</div>
</div>
</div>
<div class="sect1">
<h2 id="wrapping-up">Wrapping up</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In this article, we have explored the <strong>ideas</strong> that inspired Data-Oriented Programming and the <strong>discoveries</strong> that made it applicable in production systems at <strong>scale</strong> in most programming languages.</p>
</div>
</div>
</div>Yehonathan SharvitThis article is an excerpt from my book about Data-Oriented Programming. More excerpts are available on my blog.Polymorphism without objects via multimethods2021-10-04T01:54:21+02:002021-10-04T01:54:21+02:00/javascript/2021/10/04/multimethod<p><strong>Object-Oriented Programming</strong> is well known for allowing different classes to be called with the same interface, via a mechanism called <strong>polymorphism</strong>. It may seem that the only way to have polymorphism in a program is with objects. In fact, as we are going to see in this article it is possible to have <strong>polymorphism without objects</strong> via <strong>multimethods</strong>.</p>
<p>Moreover, multimethods provide more advanced polymorphism than OOP polymorphism as they support cases where the chosen implementation depends on several argument types (multiple dispatch) and even on the dynamic value of the arguments (dynamic dispatch).</p>
<p>This article covers:</p>
<ol>
<li>Mimicking objects with multimethods (Single dispatch)</li>
<li>Multimethods where implementations depend on several argument types (Multiple dispatch)</li>
<li>Multimethods where implementations depend dynamically on several arguments (Dynamic dispatch)</li>
</ol>
<h1 id="the-essence-of-polymorphism">The essence of polymorphism</h1>
<p>In OOP, <strong>polymorphism</strong> is about defining an <strong>interface</strong> and having <strong>different classes</strong> that implement the same interface in different ways.</p>
<p>Let’s illustrate polymorphism with an adaptation of the classic OOP polymorphism example: animal greetings. Let’s say that our animals are <strong>anthropomorphic</strong> and each of them has its own way to greet, by emitting its preferred sound and telling its name.</p>
<p><em>Anthropomorphism</em> is our first word that comes from the Greek: it comes from the Greek <em>ánthrōpos</em> that means <em>human</em> and <em>morphē</em> that means <em>form</em>.</p>
<p>In fact, it’s our second word that comes from the Greek. The first one was <em>polymorphism</em> coming from the Greek <em>polús</em> that means <em>many</em> and <em>morphē</em> that means <em>form</em>. Polymorphism is the ability of different objects to implement in different ways the same method.</p>
<p>In Java, for instance, we’d define an <code class="language-plaintext highlighter-rouge">IAnimal</code> interface with a <code class="language-plaintext highlighter-rouge">greet</code> method and each animal class would implement <code class="language-plaintext highlighter-rouge">greet</code> in its own way, like this:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">interface</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">();</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Dog</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Woof woof! My name is "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Cat</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Meow! I am "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Cow</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Moo! Call me "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p>Now, let’s ask ourselves: what is the <strong>fundamental</strong> difference between OOP polymorphism and a <strong>naive switch statement</strong>?</p>
<p>Let me tell you what I mean by a naive switch statement. We could, as <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7">Data-Oriented programming</a> recommends, represent an animal with a <strong>map</strong> having two <strong>fields</strong> <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">type</code> and call a different piece of code depending on the value of <code class="language-plaintext highlighter-rouge">type</code>, like this:</p>
<pre><code class="language-klipse-eval-js">function greet(animal) {
switch (animal.type) {
case "dog":
console.log("Woof Woof! My name is: " + animal.name);
break;
case "cat":
console.log("Meow! I am: " + animal.name);
break;
case "cow":
console.log("Moo! Call me " + animal.name);
break;
};
}
</code></pre>
<p>That reminds me: we have not yet met our animals. Without further ado, I am happy to present our heroes: Fido, Milo and Clarabelle.</p>
<p><img src="/assets/fido-milo-clarabelle.jpg" alt="Fido" /></p>
<pre><code class="language-klipse-eval-js">var myDog = {
"type": "dog",
"name": "Fido"
};
var myCat = {
"type": "cat",
"name": "Milo"
};
var myCow = {
"type": "cow",
"name": "Clarabelle"
};
</code></pre>
<p>The first difference between <strong>OOP polymorphism</strong> and our <strong>switch statement</strong> is that, if we pass an invalid map to the <code class="language-plaintext highlighter-rouge">greet</code> function, bad things will happen.</p>
<p>We could easily fix that by validating input data using <a href="/javascript/2021/09/30/data-validation-with-json-schema.html">JSON Schema</a>.</p>
<p>Another drawback of the switch statement approach is that when you want to <strong>modify</strong> the implementation of <code class="language-plaintext highlighter-rouge">greet</code> for a specific animal, you have to change the code that deals with all the animals, while in the OOP approach you have to change only a specific animal class.</p>
<p>This could also be easily fixed by having a <strong>separate function</strong> for each animal, like this:</p>
<pre><code class="language-klipse-eval-js">function greetDog(animal) {
console.log("Woof Woof! My name is: " + animal.name);
}
function greetCat(animal) {
console.log("Meow! I am: " + animal.name);
}
function greetCow(animal) {
console.log("Moo! Call me " + animal.name);
}
function greet(animal) {
switch (animal.type) {
case "dog":
greetDog(animal);
break;
case "cat":
greetCat(animal);
break;
case "cow":
greetCow(animal);
break;
};
}
</code></pre>
<p>But what if you want to <strong>extend</strong> the functionality of greet and add a new animal?</p>
<p>Now, we got to the <strong>essence</strong> of polymorphism! With a switch statement, we cannot add a new animal without modifying the original code, while in OOP we can add a new class without having to modify the original code.</p>
<blockquote>
<p>The main benefit of polymorphism is that it makes the code easily extensible.</p>
</blockquote>
<p>Now, I have a surprise for you: We don’t need objects to make our code easily extensible. This is what we call: <strong>polymorphism without objects</strong>. And it is possible with <strong>multimethods</strong>.</p>
<h1 id="multimethods-with-single-dispatch">Multimethods with single dispatch</h1>
<p><strong>Multimethod</strong> is a software construct that provides <strong>polymorphism</strong> without the need for objects.</p>
<p>Multimethods are made of two pieces:</p>
<ol>
<li>A <strong>dispatch function</strong> that emits a <strong>dispatched value</strong></li>
<li>A set of <strong>methods</strong> that provide an <strong>implementation</strong> for each dispatched value.</li>
</ol>
<p>A dispatch function is similar to an interface in the sense that it defines the way the function needs to be called. But it goes beyond that as it also dispatches a value that differentiates between the different implementations.</p>
<p>Let me illustrate how I would implement the animal greeting capabilities using a multimethod called <code class="language-plaintext highlighter-rouge">greet</code>. We need a dispatch function and 3 methods. Let’s call the dispatch function <code class="language-plaintext highlighter-rouge">greetDispatch</code>: it dispatches the animal type, either <code class="language-plaintext highlighter-rouge">"dog"</code>, <code class="language-plaintext highlighter-rouge">"cat"</code> or <code class="language-plaintext highlighter-rouge">"cow"</code>.</p>
<p>And each dispatch value is handled by a specific method:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">"dog"</code> by <code class="language-plaintext highlighter-rouge">greetDog</code></li>
<li><code class="language-plaintext highlighter-rouge">"cat"</code> by <code class="language-plaintext highlighter-rouge">greetCat</code></li>
<li><code class="language-plaintext highlighter-rouge">"cow"</code> by <code class="language-plaintext highlighter-rouge">greetCow</code>.</li>
</ul>
<p><img src="/assets/multimethod-animal.png" alt="multi-single-dispatch" /></p>
<p>In the diagram, there is an arrow between animal and the methods in addition to the arrow between animal and the dispatch function because the arguments of a multimethod are passed to the dispatch function and to the methods.</p>
<p>For now, our multimethod receives a single argument. But in the next section, it will receive several arguments.</p>
<p>Let’s see what a multimethod looks like in terms of code. For that, we need a library. For instance, in JavaScript using a library named <a href="https://github.com/caderek/arrows/tree/master/packages/multimethod">arrows/multimethod</a>, we call <code class="language-plaintext highlighter-rouge">multi</code> to create a multimethod and <code class="language-plaintext highlighter-rouge">method</code> to create a method.</p>
<p>We start the definition of a multimethod by declaring its <strong>dispatch function</strong>. In our case, the dispatch function emits the type of the animal as the dispatched value:</p>
<pre><code class="language-klipse-eval-js">var greet = multi(animal => animal.type);
</code></pre>
<p>Then, we need a method for each dispatch value. In our case, we’ll have <code class="language-plaintext highlighter-rouge">greetDog</code> for dogs, <code class="language-plaintext highlighter-rouge">greetCat</code> for cats and <code class="language-plaintext highlighter-rouge">greetCow</code> for cows:</p>
<pre><code class="language-klipse-eval-js">function greetDog(animal) {
console.log("Woof woof! My name is " + animal.name);
}
greet = method("dog", greetDog)(greet);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetCat(animal) {
console.log("Meow! I am " + animal.name);
}
greet = method("cat", greetCat)(greet);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetCow(animal) {
console.log("Moo! Call me " + animal.name);
}
greet = method("cow", greetCow)(greet);
</code></pre>
<p>It is important to notice that each method declaration could live in its own file. That’s how multimethods provide <strong>extensibility</strong>: We are free to add new methods without having to modify the original implementation.</p>
<blockquote>
<p>Method declarations are decoupled from the multimethod initialization</p>
</blockquote>
<p>Under the hood, the <code class="language-plaintext highlighter-rouge">arrows/multimethod</code> library maintains a <strong>hash map</strong>, where the keys correspond to the values emitted by the dispatch function and the values are the methods. When you call the multimethod, the library queries the hash map to find the implementation that corresponds to the dispatched value.</p>
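<p>The lookup described above can be sketched in a few lines. This is an illustration of the idea only: <code>makeMulti</code> and <code>addMethod</code> are hypothetical names, not the API nor the actual implementation of <code>arrows/multimethod</code>. In particular, this sketch mutates its registry, whereas the library calls shown in this article return a new multimethod each time.</p>

```javascript
// Illustration only: `makeMulti` and `addMethod` are hypothetical names,
// not the API or the implementation of arrows/multimethod.
function makeMulti(dispatchFn) {
  const methods = new Map(); // dispatched value -> implementation

  function multimethod(...args) {
    const dispatchedValue = dispatchFn(...args);
    const implementation = methods.get(dispatchedValue);
    if (implementation === undefined) {
      throw new Error("No method for dispatched value: " + dispatchedValue);
    }
    // The same arguments go to the dispatch function and to the method.
    return implementation(...args);
  }

  multimethod.addMethod = function (dispatchedValue, implementation) {
    methods.set(dispatchedValue, implementation);
    return multimethod;
  };

  return multimethod;
}

const greeting = makeMulti(animal => animal.type);
greeting.addMethod("dog", animal => "Woof woof! My name is " + animal.name);
greeting.addMethod("cat", animal => "Meow! I am " + animal.name);

console.log(greeting({ type: "dog", name: "Fido" })); // Woof woof! My name is Fido
console.log(greeting({ type: "cat", name: "Milo" })); // Meow! I am Milo
```

<p>Adding support for a new animal amounts to one more entry in the map, which is why no existing code has to change.</p>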
<p>In terms of usage, we call a multimethod as a regular function:</p>
<pre><code class="language-klipse-eval-js">greet(myCow);
</code></pre>
<p>And if by mistake we pass an animal that doesn’t have a corresponding method, we get a <code class="language-plaintext highlighter-rouge">NoMethodError</code> exception:</p>
<pre><code class="language-klipse-eval-js">var myHorse = {
"type": "horse",
"name": "Horace"
};
greet(myHorse);
</code></pre>
<p>Unless you declare a <strong>default method</strong>:</p>
<pre><code class="language-klipse-eval-js">function defaultGreet(animal) {
console.log("My name is " + animal.name);
}
greet = method(defaultGreet)(greet);
</code></pre>
<p>Now our horse can greet:</p>
<pre><code class="language-klipse-eval-js">greet(myHorse);
</code></pre>
<h1 id="multimethods-with-multiple-dispatch">Multimethods with multiple dispatch</h1>
<p>So far, we have mimicked OOP by having as a dispatch value the type of the multimethod argument. But if you think again about the flow of a multimethod, you will discover something interesting: in fact the dispatch function could emit any value.</p>
<p><img src="/assets/multimethod.png" alt="multi" /></p>
<p>For instance, we could emit the type of two arguments!</p>
<p>Imagine that our animals are <strong>polyglot</strong>.</p>
<p><em>Polyglot</em> comes from the Greek <em>polús</em> meaning <em>much</em> and <em>glôssa</em> meaning <em>language</em>. A polyglot is a person speaking many languages.</p>
<p>Let’s say our animals speak English and French.</p>
<p>We represent a language like we represent an animal, with a map having two fields: <code class="language-plaintext highlighter-rouge">type</code> and <code class="language-plaintext highlighter-rouge">name</code>.</p>
<pre><code class="language-klipse-eval-js">var french = {
"type": "fr",
"name": "Français"
};
var english = {
"type": "en",
"name": "English"
};
</code></pre>
<p>Now, let’s write the code for the <strong>dispatch function</strong> and the <strong>methods</strong> for our polyglot animals. Let’s call our multimethod: <code class="language-plaintext highlighter-rouge">greetLang</code>. We have:</p>
<ol>
<li>one dispatch function</li>
<li>6 methods: 3 animals (dog, cat, cow) times 2 languages (en, fr).</li>
</ol>
<p>But before the implementation I’d like to draw a flow diagram. It will help me to make things crystal clear.</p>
<p><img src="/assets/multimethod-animal-polyglot.png" alt="multi-single-dispatch" /></p>
<p>I omitted the arrow between the arguments and the methods in order to keep the diagram readable. Otherwise there would be too many arrows.</p>
<p>The dispatch function is going to return an array with two elements: the type of the animal and the type of the language:</p>
<pre><code class="language-klipse-eval-js">var greetLang = multi((animal, language) => [animal.type, language.type]);
</code></pre>
<blockquote>
<p>A dispatch function could emit any value. It gives us more flexibility than OOP polymorphism.</p>
</blockquote>
<p>The order of the elements in the array doesn’t matter, but it needs to be consistent with the wiring of the methods.</p>
<p>Now, let’s implement the 6 methods:</p>
<pre><code class="language-klipse-eval-js">function greetLangDogEn(animal, language) {
console.log("Woof woof! My name is " + animal.name + " and I speak " +
language.name);
}
greetLang = method(["dog", "en"], greetLangDogEn)(greetLang);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetLangDogFr(animal, language) {
console.log("Ouaf Ouaf! Mon nom est " + animal.name + " et je parle " +
language.name);
}
greetLang = method(["dog", "fr"], greetLangDogFr)(greetLang);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetLangCatEn(animal, language) {
console.log("Meow! I am " + animal.name + " and I speak " + language.name);
}
greetLang = method(["cat", "en"], greetLangCatEn)(greetLang);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetLangCatFr(animal, language) {
console.log("Miaou! Je m'appelle " + animal.name + " et je parle " + language.name);
}
greetLang = method(["cat", "fr"], greetLangCatFr)(greetLang);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetLangCowEn(animal, language) {
console.log("Moo! Call me " + animal.name + " and I speak " + language.name);
}
greetLang = method(["cow", "en"], greetLangCowEn)(greetLang);
</code></pre>
<pre><code class="language-klipse-eval-js">function greetLangCowFr(animal, language) {
console.log("Meuh! Appelle moi " + animal.name + " et je parle " + language.name);
}
greetLang = method(["cow", "fr"], greetLangCowFr)(greetLang);
</code></pre>
<p>Take a closer look at the code for the methods that deal with French and tell me if you are surprised to see “Ouaf Ouaf”
instead of “Woof Woof” for dogs, “Miaou” instead of “Meow” for cats and “Meuh” instead of “Moo” for cows. I find it funny that animal <strong>onomatopoeias</strong> are different in French than in English!</p>
<p><em>Onomatopoeia</em> also comes from the Greek: <em>ónoma</em> means <em>name</em> and <em>poiéō</em> means <em>to produce</em>. It is the property of words that sound like what they represent. For instance, Woof, Meow and Moo.</p>
<p><strong>Multiple dispatch</strong> is when a dispatch function emits a value that depends on more than one argument.</p>
<p>Let’s see our multimethod in action and ask our dog <code class="language-plaintext highlighter-rouge">Fido</code> to greet in French:</p>
<pre><code class="language-klipse-eval-js">greetLang(myDog, french);
</code></pre>
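<p>To see why the order of the elements in the dispatch value has to match the wiring, here is a minimal sketch of a lookup table keyed by array dispatch values. It is not how <code class="language-plaintext highlighter-rouge">arrows/multimethod</code> works internally: the names <code class="language-plaintext highlighter-rouge">register</code>, <code class="language-plaintext highlighter-rouge">lookup</code> and <code class="language-plaintext highlighter-rouge">greetings</code> are made up for illustration, and serializing the array with <code class="language-plaintext highlighter-rouge">JSON.stringify</code> is just one possible way to compare arrays by content.</p>

```javascript
// Hypothetical lookup table: a sketch, not the internals of arrows/multimethod.
var greetings = new Map();

// Two arrays with the same content are still different objects in JavaScript,
// so we serialize the dispatch value to a string before using it as a key.
function register(dispatchValue, fn) {
  greetings.set(JSON.stringify(dispatchValue), fn);
}

function lookup(dispatchValue) {
  return greetings.get(JSON.stringify(dispatchValue));
}

// The method returns the greeting instead of logging it, to keep the sketch easy to check.
register(["dog", "fr"], function (animal, language) {
  return "Ouaf Ouaf! Mon nom est " + animal.name + " et je parle " + language.name;
});

// Same order as at registration time: the method is found.
var found = lookup(["dog", "fr"]);
console.log(found({ type: "dog", name: "Fido" }, { type: "fr", name: "Français" }));

// Swapped order: no method is registered under this key.
console.log(lookup(["fr", "dog"])); // undefined
```

<p>Whatever order you pick, animal first or language first, the dispatch function and every registered dispatch value have to agree on it.</p>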
<h1 id="multimethods-with-dynamic-dispatch">Multimethods with dynamic dispatch</h1>
<p><strong>Dynamic dispatch</strong> is when the dispatch function of a multimethod returns a value that goes <strong>beyond the static type</strong> of its arguments, for instance a number or a boolean.</p>
<p>Imagine that instead of being polyglot, our animals suffer from <strong>dysmakrylexia</strong>.</p>
<p><em>Dysmakrylexia</em> comes from the Greek <em>dus</em> expressing the idea of <em>difficulty</em>, <em>makrýs</em> meaning <em>long</em> and <em>léxis</em> that means <em>diction</em>. Therefore, dysmakrylexia is a difficulty in pronouncing long words.</p>
<p>It’s not a real word, I invented it for the purpose of this article!</p>
<p>Let’s say that an animal whose name has more than 5 letters is not able to say it.</p>
<p>Let’s call our multimethod <code class="language-plaintext highlighter-rouge">dysGreet</code>.</p>
<p><img src="/assets/multimethod-dys.png" alt="multi-single-dispatch" /></p>
<p>Its dispatch function returns an array with two elements: the animal type and a boolean about whether the name is long or not:</p>
<pre><code class="language-klipse-eval-js">var dysGreet = multi(animal => [animal.type, animal.name.length > 5]);
</code></pre>
<p>And here are the methods:</p>
<pre><code class="language-klipse-eval-js">function dysGreetDogShort(animal) {
console.log("Woof woof! My name is " + animal.name);
}
dysGreet = method(["dog", false], dysGreetDogShort)(dysGreet);
</code></pre>
<pre><code class="language-klipse-eval-js">function dysGreetDogLong(animal) {
console.log("Woof woof!");
}
dysGreet = method(["dog", true], dysGreetDogLong)(dysGreet);
</code></pre>
<pre><code class="language-klipse-eval-js">function dysGreetCatShort(animal) {
console.log("Meow! I am " + animal.name);
}
dysGreet = method(["cat", false], dysGreetCatShort)(dysGreet);
</code></pre>
<pre><code class="language-klipse-eval-js">function dysGreetCatLong(animal) {
console.log("Meow!");
}
dysGreet = method(["cat", true], dysGreetCatLong)(dysGreet);
</code></pre>
<pre><code class="language-klipse-eval-js">function dysGreetCowShort(animal) {
console.log("Moo! Call me " + animal.name);
}
dysGreet = method(["cow", false], dysGreetCowShort)(dysGreet);
</code></pre>
<pre><code class="language-klipse-eval-js">function dysGreetCowLong(animal) {
console.log("Moo!");
}
dysGreet = method(["cow", true], dysGreetCowLong)(dysGreet);
</code></pre>
<p>And now, if we ask Clarabelle to greet, she omits her name:</p>
<pre><code class="language-klipse-eval-js">dysGreet(myCow)
</code></pre>
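<p>To make the dispatch values concrete, here is a small sketch that prints what the dispatch logic returns for a few animals. The function name <code class="language-plaintext highlighter-rouge">dysGreetDispatchSketch</code> and the dog named Bella are made up for illustration, and the animals are redefined so that the snippet stands on its own.</p>

```javascript
// Same dispatch logic as dysGreet, written as a standalone function
// (the name dysGreetDispatchSketch is ours, for illustration only).
function dysGreetDispatchSketch(animal) {
  return [animal.type, animal.name.length > 5];
}

var sampleAnimals = [
  { type: "dog", name: "Fido" },       // 4 letters
  { type: "cat", name: "Milo" },       // 4 letters
  { type: "cow", name: "Clarabelle" }, // 10 letters
  { type: "dog", name: "Bella" }       // hypothetical: exactly 5 letters, so not "more than 5"
];

// Print each name next to the dispatch value it produces.
sampleAnimals.forEach(function (animal) {
  console.log(animal.name, JSON.stringify(dysGreetDispatchSketch(animal)));
});
```

<p>Only Clarabelle gets <code class="language-plaintext highlighter-rouge">true</code> as the second element, which is why she is the only one routed to a method that omits the name.</p>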
<h1 id="multimethods-in-other-languages">Multimethods in other languages</h1>
<p>Multimethods are available in many languages besides JavaScript. In Common Lisp and Clojure, they are part of the language. In Python, there is a library called <a href="https://github.com/weissjeffm/multimethods">multimethods</a> and in Ruby there is <a href="https://github.com/psantacl/ruby-multimethods">Ruby multimethods</a>. Both work quite like JavaScript arrows/multimethod.</p>
<p>In Java, there is the <a href="http://igm.univ-mlv.fr/~forax/works/jmmf/">Java Multimethod Framework</a> and C# supports multimethods natively via the <code class="language-plaintext highlighter-rouge">dynamic</code> keyword. However, in both cases, it works only with static data types and not with generic data structures. Also, dynamic dispatch is not supported.</p>
<h1 id="wrapping-up">Wrapping up</h1>
<p><strong>Multimethods</strong> make it possible to benefit from <strong>polymorphism</strong> when <strong>data</strong> is represented with <strong>generic maps</strong>. Multimethods are made of a <strong>dispatch function</strong> that emits a dispatch value and <strong>methods</strong> that provide implementations for the dispatch values.</p>
<p>In the simplest case (<strong>single dispatch</strong>), the multimethod receives a single map that contains a type field and the dispatch function of the multimethod emits the value of the type field. In more advanced cases (<strong>multiple dispatch</strong> and <strong>dynamic dispatch</strong>), the dispatch function emits an arbitrary value that can depend on several arguments or on their runtime values.</p>
<script src="https://viebel.github.io/klipse/repo/js/multimethod.js"></script>
<script>var {multi, method, fromMulti} = window.multimethod;</script>
Yehonathan Sharvit
Polymorphism without objects via multimethods 2021-10-02T22:12:21+02:00 2021-10-02T22:12:21+02:00 /javascript/2021/10/02/multimethod
<p><strong>Object-Oriented Programming</strong> is well known for allowing different classes to be called with the same interface, via a mechanism called <strong>polymorphism</strong>. It may seem that the only way to have polymorphism in a program is with objects. In fact, as we are going to see in this article it is possible to have <strong>polymorphism without objects</strong> via <strong>multimethods</strong>.</p>
<p><em>This article has been revised and improved. The revised version is available <a href="/javascript/2021/10/04/multimethod.html">here</a></em></p>
<p>Moreover, multimethods provide more advanced polymorphism than OOP polymorphism as they support cases where the chosen implementation depends on several argument types (multiple dispatch) and even on the dynamic value of the arguments (dynamic dispatch).</p>
<p>This article covers:</p>
<ol>
<li>Mimicking objects with multimethods (Single dispatch)</li>
<li>Multimethods where implementations depend on several argument types (Multiple dispatch)</li>
<li>Multimethods where implementations depend on the dynamic value of the arguments (Dynamic dispatch)</li>
</ol>
<h1 id="the-essence-of-polymorphism">The essence of polymorphism</h1>
<p>In OOP, <strong>polymorphism</strong> is about defining an <strong>interface</strong> and having <strong>different classes</strong> that implement the same interface in different ways.</p>
<p>Let’s illustrate polymorphism with an adaptation of the classic OOP polymorphism example: animal greetings. Let’s say that our animals are <strong>anthropomorphic</strong> and each of them has its own way to greet, by emitting its preferred sound and telling its name.</p>
<p><em>Anthropomorphism</em> is our first word that comes from the Greek: <em>ánthrōpos</em> means <em>human</em> and <em>morphē</em> means <em>form</em>.</p>
<p>In fact, it’s our second word that comes from the Greek. The first one was <em>polymorphism</em>, from the Greek <em>polús</em> that means <em>many</em> and <em>morphē</em> that means <em>form</em>. Polymorphism is the ability of different objects to implement the same method in different ways.</p>
<p>In Java, for instance, we’d define an <code class="language-plaintext highlighter-rouge">IAnimal</code> interface with a <code class="language-plaintext highlighter-rouge">greet</code> method and each animal class would implement <code class="language-plaintext highlighter-rouge">greet</code> in its own way, like this:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">interface</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">();</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Dog</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Woof woof! My name is "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Cat</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Meow! I am "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="kd">class</span> <span class="nc">Cow</span> <span class="kd">implements</span> <span class="nc">IAnimal</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">name</span><span class="o">;</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">greet</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Moo! Call me "</span> <span class="o">+</span> <span class="n">name</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p>Now, let’s ask ourselves: what is the <strong>fundamental</strong> difference between OOP polymorphism and a <strong>naive switch statement</strong>?</p>
<p>Let me tell you what I mean by a naive switch statement. We could, as <a href="https://www.manning.com/books/data-oriented-programming?utm_source=viebel&utm_medium=affiliate&utm_campaign=book_sharvit2_data_1_29_21&a_aid=viebel&a_bid=d5b546b7">Data-Oriented programming</a> recommends, represent an animal with a <strong>map</strong> having two <strong>fields</strong> <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">type</code> and call a different piece of code depending on the value of <code class="language-plaintext highlighter-rouge">type</code>, like this:</p>
<pre><code class="language-klipse-eval-js">function greet(animal) {
switch (animal.type) {
case "dog":
console.log("Woof Woof! My name is: " + animal.name);
break;
case "cat":
console.log("Meow! I am: " + animal.name);
break;
case "cow":
console.log("Moo! Call me " + animal.name);
break;
};
}
</code></pre>
<p>That reminds me that we have not yet met our animals. Without further ado, I am happy to present our heroes: Fido, Milo and Clarabelle.</p>
<p><img src="/assets/fido-milo-clarabelle.jpg" alt="Fido" /></p>
<pre><code class="language-klipse-eval-js">var myDog = {
"type": "dog",
"name": "Fido"
};
var myCat = {
"type": "cat",
"name": "Milo"
};
var myCow = {
"type": "cow",
"name": "Clarabelle"
};
</code></pre>
<p>The first difference between <strong>OOP polymorphism</strong> and our <strong>switch statement</strong> is that, if we pass an invalid map to the <code class="language-plaintext highlighter-rouge">greet</code> function, bad things will happen: for instance, a map with an unknown <code class="language-plaintext highlighter-rouge">type</code> silently produces no greeting at all.</p>
<p>We could easily fix that by validating input data using <a href="/javascript/2021/09/30/data-validation-with-json-schema.html">JSON Schema</a>.</p>
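<p>As a rough illustration of rejecting invalid maps up front, here is a sketch with a hand-written guard. It only stands in for a real JSON Schema validation: the names <code class="language-plaintext highlighter-rouge">KNOWN_ANIMAL_TYPES</code>, <code class="language-plaintext highlighter-rouge">isValidAnimal</code> and <code class="language-plaintext highlighter-rouge">greetChecked</code> are made up for this example, and the function returns the greeting instead of logging it so that the result is easy to inspect.</p>

```javascript
// Hypothetical guard, standing in for a real JSON Schema validation.
var KNOWN_ANIMAL_TYPES = ["dog", "cat", "cow"];

function isValidAnimal(animal) {
  return animal !== null &&
    typeof animal === "object" &&
    typeof animal.name === "string" &&
    KNOWN_ANIMAL_TYPES.includes(animal.type);
}

// Same switch as greet, but invalid input is rejected loudly
// instead of being silently ignored.
function greetChecked(animal) {
  if (!isValidAnimal(animal)) {
    throw new Error("Invalid animal: " + JSON.stringify(animal));
  }
  switch (animal.type) {
    case "dog":
      return "Woof Woof! My name is: " + animal.name;
    case "cat":
      return "Meow! I am: " + animal.name;
    case "cow":
      return "Moo! Call me " + animal.name;
  }
}

console.log(greetChecked({ type: "dog", name: "Fido" }));
```

<p>With this guard, a map such as <code class="language-plaintext highlighter-rouge">{type: "fish", name: "Nemo"}</code> raises an error instead of producing nothing.</p>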
<p>Another drawback of the switch statement approach is that when you want to <strong>modify</strong> the implementation of <code class="language-plaintext highlighter-rouge">greet</code> for a specific animal, you have to change the code that deals with all the animals, while in the OOP approach, you only have to change a specific animal class.</p>
<p>This could also be easily fixed by having a <strong>separate function</strong> for each animal, like this:</p>
<pre><code class="language-klipse-eval-js">function greetDog(animal) {
console.log("Woof Woof! My name is: " + animal.name);
}
function greetCat(animal) {
console.log("Meow! I am: " + animal.name);
}
function greetCow(animal) {
console.log("Moo! Call me " + animal.name);
}
function greet(animal) {
switch (animal.type) {
case "dog":
greetDog(animal);
break;
case "cat":
greetCat(animal);
break;
case "cow":
greetCow(animal);
break;
};
}
</code></pre>
<p>But what if you want to <strong>extend</strong> the functionality of greet and add a new animal?</p>
<p>Now we get to the <strong>essence</strong> of polymorphism! With a switch statement, we cannot add a new animal without modifying the original code, while in OOP we can add a new class without having to modify the original code.</p>
<blockquote>
<p>The main benefit of polymorphism is that it makes the code easily extensible.</p>
</blockquote>
<p>Now, I have a surprise for you: We don’t need objects to make our code easily extensible. This is what we call: <strong>polymorphism without objects</strong>. And it is possible with <strong>multimethods</strong>.</p>
<h1 id="multimethods-with-single-dispatch">Multimethods with single dispatch</h1>
<p>A <strong>multimethod</strong> is a software construct that provides <strong>polymorphism</strong> without the need for objects.</p>
<p>Multimethods are made of two pieces:</p>
<ol>
<li>A <strong>dispatch function</strong> that emits a <strong>dispatch value</strong></li>
<li>A set of <strong>methods</strong> that provide an <strong>implementation</strong> for each dispatch value.</li>
</ol>
<p>A dispatch function is similar to an interface in the sense that it defines the way the function needs to be called. But it goes beyond that as it also dispatches a value that differentiates between the different implementations.</p>
<p>Let me illustrate how I would implement the animal greeting capabilities using a multimethod called <code class="language-plaintext highlighter-rouge">greet</code>. We need a dispatch function and 3 methods. Let’s call the dispatch function <code class="language-plaintext highlighter-rouge">greetDispatch</code>: it dispatches the animal type, either <code class="language-plaintext highlighter-rouge">"dog"</code>, <code class="language-plaintext highlighter-rouge">"cat"</code> or <code class="language-plaintext highlighter-rouge">"cow"</code>.</p>
<p>And each dispatch value is handled by a specific method:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">"dog"</code> by <code class="language-plaintext highlighter-rouge">greetDog</code></li>
<li><code class="language-plaintext highlighter-rouge">"cat"</code> by <code class="language-plaintext highlighter-rouge">greetCat</code></li>
<li><code class="language-plaintext highlighter-rouge">"cow"</code> by <code class="language-plaintext highlighter-rouge">greetCow</code>.</li>
</ul>
<p><img src="/assets/multimethod-animal.png" alt="multi-single-dispatch" /></p>
<p>In the diagram, there is an arrow between animal and the methods in addition to the arrow between animal and the dispatch function because the arguments of a multimethod are passed to the dispatch function and to the methods.</p>
<p>For now, our multimethod receives a single argument. But in the next section, it will receive several arguments.</p>
<p>Let’s see what a multimethod looks like in terms of code.</p>
<p>We start with the dispatch function <code class="language-plaintext highlighter-rouge">greetDispatch</code>: it defines the signature of the multimethod and emits the type of the animal as the dispatch value:</p>
<pre><code class="language-klipse-eval-js">function greetDispatch(animal) {
return animal.type;
}
</code></pre>
<p>Now, we need a method for each dispatch value. In our case, we’ll have <code class="language-plaintext highlighter-rouge">greetDog</code> for dogs, <code class="language-plaintext highlighter-rouge">greetCat</code> for cats and <code class="language-plaintext highlighter-rouge">greetCow</code> for cows:</p>
<pre><code class="language-klipse-eval-js">function greetDog(animal) {
console.log("Woof woof! My name is " + animal.name);
}
function greetCat(animal) {
console.log("Meow! I am " + animal.name);
}
function greetCow(animal) {
console.log("Moo! Call me " + animal.name);
}
</code></pre>
<blockquote>
<p>In the context of multimethods, a method is a function that provides an implementation for a dispatch value.</p>
</blockquote>
<p>On the one hand we have the greet dispatch function and on the other hand we have the different greet implementations. How do you <strong>wire</strong> everything together?</p>
<p>For that, we need a library. For instance, in JavaScript using a library named <a href="https://github.com/caderek/arrows/tree/master/packages/multimethod">arrows/multimethod</a>, we call <code class="language-plaintext highlighter-rouge">multi</code> to create a multimethod and <code class="language-plaintext highlighter-rouge">method</code> to create a method:</p>
<pre><code class="language-klipse-eval-js">var greet = multi(
greetDispatch,
method("dog", greetDog),
method("cat", greetCat),
method("cow", greetCow)
);
</code></pre>
<p>The names of the dispatch function and the methods are not really important. But I like to follow a simple <strong>naming convention</strong>: use the name of the multimethod as a <strong>prefix</strong> for the dispatch function and the methods and have the <code class="language-plaintext highlighter-rouge">Dispatch</code> <strong>suffix</strong> for the dispatch function and a specific <strong>suffix</strong> for each method.</p>
<p>Under the hood, the <code class="language-plaintext highlighter-rouge">arrows/multimethod</code> library maintains a <strong>hash map</strong>, where the keys are the values emitted by the dispatch function and the values are the methods. When you call <code class="language-plaintext highlighter-rouge">method</code>, the library adds an entry to the hash map, and when you call the multimethod, it queries the hash map to find the implementation that corresponds to the dispatch value.</p>
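<p>To make the hash map idea tangible, here is a deliberately simplified sketch of a multimethod built on a JavaScript <code class="language-plaintext highlighter-rouge">Map</code>. It is not the actual code of <code class="language-plaintext highlighter-rouge">arrows/multimethod</code>: the names <code class="language-plaintext highlighter-rouge">makeMulti</code>, <code class="language-plaintext highlighter-rouge">addMethod</code> and <code class="language-plaintext highlighter-rouge">sketchGreet</code> are made up, the methods return the greeting instead of logging it, and throwing an error when no method matches is a choice made for this sketch.</p>

```javascript
// Hypothetical helper: NOT the arrows/multimethod implementation,
// only a sketch of the hash-map idea.
function makeMulti(dispatchFn) {
  var methods = new Map(); // serialized dispatch value -> method

  function multimethod(...args) {
    // 1. Compute the dispatch value from the arguments.
    var key = JSON.stringify(dispatchFn(...args));
    // 2. Query the hash map for the matching method.
    if (!methods.has(key)) {
      throw new Error("No method for dispatch value " + key);
    }
    // 3. Call the method with the original arguments.
    return methods.get(key)(...args);
  }

  // Registering a method adds an entry to the hash map.
  multimethod.addMethod = function (dispatchValue, fn) {
    methods.set(JSON.stringify(dispatchValue), fn);
    return multimethod;
  };

  return multimethod;
}

var sketchGreet = makeMulti(function (animal) { return animal.type; })
  .addMethod("dog", function (animal) { return "Woof woof! My name is " + animal.name; })
  .addMethod("cow", function (animal) { return "Moo! Call me " + animal.name; });

// Adding a new animal later does not modify any of the code above.
sketchGreet.addMethod("cat", function (animal) { return "Meow! I am " + animal.name; });

console.log(sketchGreet({ type: "cat", name: "Milo" }));
```

<p>Notice that supporting a new animal is a matter of adding one entry to the map: neither the dispatch function nor the existing methods are touched.</p>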
<p>In terms of usage, we call a multimethod as a regular function:</p>
<pre><code class="language-klipse-eval-js">greet(myCow);
</code></pre>
<h1 id="multimethods-with-multiple-dispatch">Multimethods with multiple dispatch</h1>
<p>So far, we have mimicked OOP by having as a dispatch value the type of the multimethod argument. But if you think again about the flow of a multimethod, you will discover something interesting: in fact the dispatch function could emit any value.</p>
<p><img src="/assets/multimethod.png" alt="multi" /></p>
<p>For instance, we could emit the type of two arguments!</p>
<p>Imagine that our animals are <strong>polyglot</strong>.</p>
<p><em>Polyglot</em> comes from the Greek <em>polús</em> meaning <em>many</em> and <em>glôssa</em> meaning <em>language</em>. A polyglot is a person who speaks many languages.</p>
<p>Let’s say our animals speak English and French.</p>
<p>We represent a language like we represent an animal, with a map having two fields: <code class="language-plaintext highlighter-rouge">type</code> and <code class="language-plaintext highlighter-rouge">name</code>.</p>
<pre><code class="language-klipse-eval-js">var french = {
"type": "fr",
"name": "Français"
};
var english = {
"type": "en",
"name": "English"
};
</code></pre>
<p>Now, let’s write the code for the <strong>dispatch function</strong> and the <strong>methods</strong> for our polyglot animals. Let’s call our multimethod: <code class="language-plaintext highlighter-rouge">greetLang</code>. We have:</p>
<ol>
<li>one dispatch function</li>
<li>6 methods: 3 animals (dog, cat, cow) times 2 languages (en, fr).</li>
</ol>
<p>But before the implementation I’d like to draw a flow diagram. It will help me to make things crystal clear.</p>
<p><img src="/assets/multimethod-animal-polyglot.png" alt="multi-single-dispatch" /></p>
<p>I omitted the arrow between the arguments and the methods in order to keep the diagram readable. Otherwise there would be too many arrows.</p>
<p>The dispatch function is going to return an array with two elements: the type of the animal and the type of the language:</p>
<pre><code class="language-klipse-eval-js">function greetLangDispatch(animal, language) {
return [animal.type, language.type];
};
</code></pre>
<p>The order of the elements in the array doesn’t matter, but it needs to be consistent with the wiring of the methods.</p>
<p>Now, let’s implement the 6 methods:</p>
<pre><code class="language-klipse-eval-js">function greetLangDogEn(animal, language) {
console.log("Woof woof! My name is " + animal.name + " and I speak " +
language.name);
}
function greetLangDogFr(animal, language) {
console.log("Ouaf Ouaf! Mon nom est " + animal.name + " et je parle " +
language.name);
}
function greetLangCatEn(animal, language) {
console.log("Meow! I am " + animal.name + " and I speak " + language.name);
}
function greetLangCatFr(animal, language) {
console.log("Miaou! Je m'appelle " + animal.name + " et je parle " + language.name);
}
function greetLangCowEn(animal, language) {
console.log("Moo! Call me " + animal.name + " and I speak " + language.name);
}
function greetLangCowFr(animal, language) {
console.log("Meuh! Appelle moi " + animal.name + " et je parle " + language.name);
}
</code></pre>
<p>Take a closer look at the code for the methods that deal with French and tell me if you are surprised to see “Ouaf Ouaf”
instead of “Woof Woof” for dogs, “Miaou” instead of “Meow” for cats and “Meuh” instead of “Moo” for cows. I find it funny that animal <strong>onomatopoeias</strong> are different in French than in English!</p>
<p><em>Onomatopoeia</em> also comes from the Greek: <em>ónoma</em> means <em>name</em> and <em>poiéō</em> means <em>to produce</em>. It is the property of words that sound like what they represent. For instance, Woof, Meow and Moo.</p>
<p>Anyway, after we have defined our <strong>dispatch function</strong> and our <strong>methods</strong>, we need to <strong>wire</strong> them all together in a multimethod, like we did with <code class="language-plaintext highlighter-rouge">greet</code>. The only difference is that the dispatch values are arrays of strings instead of strings:</p>
<pre><code class="language-klipse-eval-js">var greetLang = multi(
greetLangDispatch,
method(["dog", "en"], greetLangDogEn),
method(["dog", "fr"], greetLangDogFr),
method(["cat", "en"], greetLangCatEn),
method(["cat", "fr"], greetLangCatFr),
method(["cow", "en"], greetLangCowEn),
method(["cow", "fr"], greetLangCowFr)
);
</code></pre>
<p><strong>Multiple dispatch</strong> is when a dispatch function emits a value that depends on more than one argument.</p>
<p>Let’s see our multimethod in action and ask our dog <code class="language-plaintext highlighter-rouge">Fido</code> to greet in French:</p>
<pre><code class="language-klipse-eval-js">greetLang(myDog, french);
</code></pre>
<h1 id="multimethods-with-dynamic-dispatch">Multimethods with dynamic dispatch</h1>
<p><strong>Dynamic dispatch</strong> is when the dispatch function of a multimethod returns a value that goes <strong>beyond the static type</strong> of its arguments, for instance a number or a boolean.</p>
<p>Imagine that instead of being polyglot, our animals suffer from <strong>dysmakrylexia</strong>.</p>
<p><em>Dysmakrylexia</em> comes from the Greek <em>dus</em> expressing the idea of <em>difficulty</em>, <em>makrýs</em> meaning <em>long</em> and <em>léxis</em> that means <em>diction</em>. Therefore, dysmakrylexia is a difficulty in pronouncing long words.</p>
<p>It’s not a real word, I invented it for the purpose of this article!</p>
<p>Let’s say that an animal whose name has more than 5 letters is not able to say it.</p>
<p>Let’s call our multimethod <code class="language-plaintext highlighter-rouge">dysGreet</code>.</p>
<p><img src="/assets/multimethod-dys.png" alt="multi-single-dispatch" /></p>
<p>Its dispatch function returns an array with two elements: the animal type and a boolean about whether the name is long or not:</p>
<pre><code class="language-klipse-eval-js">function dysGreetDispatch(animal) {
var hasLongName = animal.name.length > 5;
return [animal.type, hasLongName];
};
</code></pre>
<p>And here are the methods:</p>
<pre><code class="language-klipse-eval-js">function dysGreetDogShort(animal) {
console.log("Woof woof! My name is " + animal.name);
}
function dysGreetDogLong(animal) {
console.log("Woof woof!");
}
function dysGreetCatShort(animal) {
console.log("Meow! I am " + animal.name);
}
function dysGreetCatLong(animal) {
console.log("Meow!");
}
function dysGreetCowShort(animal) {
console.log("Moo! Call me " + animal.name);
}
function dysGreetCowLong(animal) {
console.log("Moo!");
}
</code></pre>
<p>As surprising as it may sound, wiring a multimethod with dynamic dispatch is as simple as wiring a multimethod with static dispatch:</p>
<pre><code class="language-klipse-eval-js">var dysGreet = multi(
dysGreetDispatch,
method(["dog", false], dysGreetDogShort),
method(["dog", true], dysGreetDogLong),
method(["cat", false], dysGreetCatShort),
method(["cat", true], dysGreetCatLong),
method(["cow", false], dysGreetCowShort),
method(["cow", true], dysGreetCowLong)
);
</code></pre>
<p>And now, if we ask Clarabelle to greet, she omits her name:</p>
<pre><code class="language-klipse-eval-js">dysGreet(myCow)
</code></pre>
<h1 id="multimethods-in-other-languages">Multimethods in other languages</h1>
<p>Multimethods are available in many languages besides JavaScript. In Common Lisp and Clojure, they are part of the language. In Python, there is a library called <a href="https://github.com/weissjeffm/multimethods">multimethods</a> and in Ruby there is <a href="https://github.com/psantacl/ruby-multimethods">Ruby multimethods</a>. Both work quite like JavaScript arrows/multimethod.</p>
<p>In Java, there is the <a href="http://igm.univ-mlv.fr/~forax/works/jmmf/">Java Multimethod Framework</a> and C# supports multimethods natively via the <code class="language-plaintext highlighter-rouge">dynamic</code> keyword. However, in both cases, it works only with static data types and not with generic data structures. Also, dynamic dispatch is not supported.</p>
<h1 id="wrapping-up">Wrapping up</h1>
<p><strong>Multimethods</strong> make it possible to benefit from <strong>polymorphism</strong> when <strong>data</strong> is represented with <strong>generic maps</strong>. Multimethods are made of a <strong>dispatch function</strong> that emits a dispatch value and <strong>methods</strong> that provide implementations for the dispatch values.</p>
<p>In the simplest case (<strong>single dispatch</strong>), the multimethod receives a single map that contains a type field and the dispatch function of the multimethod emits the value of the type field. In more advanced cases (<strong>multiple dispatch</strong> and <strong>dynamic dispatch</strong>), the dispatch function emits an arbitrary value that can depend on several arguments or on their runtime values.</p>
<script src="https://viebel.github.io/klipse/repo/js/multimethod.js"></script>
<script>var {multi, method, fromMulti} = window.multimethod;</script>
Yehonathan Sharvit