Building a Word Cloud with D3 and Svelte

In this article, we will use D3 and Svelte to create a data visualization that displays the frequency of words in the lyrics of Vampire by Olivia Rodrigo, a song on the Billboard Hot 100 list. Which words do you think catapult a song to the Billboard Hot 100 list?

When repeated enough times, words become memorable. Anytime you listen to a speech, notice how frequently certain words come up and how the repetition helps you recognize the importance of the speaker’s message. If you only have a transcript of the speech, then you would need to read or skim through paragraphs of text to grasp the essence of the message being conveyed.

With word clouds (also known as tag clouds), you can visualize the frequency of words. Words are arranged in a cloud-shaped formation, and each word is sized and colored based on its frequency (or importance) in a given text. The more frequently a word appears, the larger (or more intense in color) it appears in the word cloud. This makes it easier to visually identify critical keywords and themes, and it summarizes the essence of the text in a single glance. Whether you are interested in seeing what trending topics are being discussed in online communities or what words leaders use to inspire their nations, a word cloud offers a clear window into any textual content.

There’s a D3 module available for generating word clouds: d3-cloud. This module takes a mapping of words and their frequencies and determines how to size and position them in a word cloud with minimal collisions. However, the pure D3 implementation of a word cloud appends an SVG <text /> element, one by one, each time a word gets processed. What happens if we want to update the word cloud using another set of words?
Rather than manually managing the DOM using D3’s imperative API (i.e., removing all of the previous SVG <text /> elements, re-appending new SVG <text /> elements, etc.), we can let Svelte render elements to the DOM and keep the DOM in sync with our data via reactivity. This way, anytime our data changes, Svelte automatically updates the DOM accordingly.

In Svelte, assignments to a component’s top-level variables are reactive. To mark any number of top-level statements as reactive, wrap them in curly braces and prefix the block with the $: label syntax. This results in reactive statements. Any values referenced within the reactive block become dependencies of the reactive statement. When any of these values change, the reactive statement gets re-run. This is perfect for our case, since we want the word cloud to update anytime we provide a different set of words.

Below, I’m going to show you how to build a word cloud with D3 and Svelte. The word cloud will display the frequency of words in the lyrics of Vampire by Olivia Rodrigo, a song on the Billboard Hot 100 list. The larger and less faded a word is, the greater its frequency in the lyrics.

To set up a new Svelte project with Vite and TypeScript, run the command npm init vite.

Note: You may generate a new Svelte application with SvelteKit, but this tutorial is only focused on building a single Svelte component for the word cloud. Therefore, it’s preferable to use a lighter template so that you don’t need to deal with extra project files.

For the word cloud visualization, we will need to install two D3 modules, d3-cloud and d3-array, along with their type definitions.

First, delete the src/lib directory and the src/app.css file. Then, in src/main.ts, remove the import './app.css' statement at the top of the file.
In the src/App.svelte file, clear out the contents of the script, style and markup sections. Within the script section, add the import statement for the <WordCloud /> component and a variable named lyrics that’s set to the lyrics of the song Vampire.

Within the style section, add some minor styles to horizontally center the <WordCloud /> component in the <main /> element.

Note: Styles defined in the <App /> component won’t leak into other Svelte components.

Within the <main /> element of the markup section, render the <WordCloud /> component and pass lyrics to its text prop.

Within the src directory, create a new folder named components. This folder will contain any reusable components used in this Svelte application. In this case, there will only be one component in this directory: WordCloud.svelte. Create this file inside of the src/components directory.

Within the src/components/WordCloud.svelte file, begin with an empty script section for the <WordCloud /> component. At the top of the script section, import d3Cloud from the d3-cloud module. Calling d3Cloud creates a new cloud layout instance, which comes with chainable methods for configuring the layout. Additionally, import three methods from the d3-array module.

Then, declare the text prop that the <WordCloud /> component accepts, and give it a default value of an empty string in case no value is passed to it.

d3Cloud comes with a chainable method called .words().
This method accepts the words and their frequencies as an array of objects with two properties. The string of text needs to be turned into an array of objects with these properties in the script section of the <WordCloud /> component (src/components/WordCloud.svelte).

Like with any D3 data visualization, you need to define its dimensions. In the <WordCloud /> component’s markup section, add an <svg /> element and set its width, height and viewBox using the values from dimensions. Since the words will be displayed using the Helvetica font family, set font-family to Helvetica.

Note: Setting text-anchor="middle" aligns the middle of the text to the text’s position. This is important since the layout algorithm determines positions using the middle of the text as the reference. By default, the start of the text gets aligned to the text’s position.

Next, define a wordPadding variable that specifies the numerical padding to apply to each word in the word cloud. Since d3-cloud internally uses an HTML5 <canvas /> element to simulate the layout algorithm, this padding (in pixels) gets multiplied by 2, and the product is assigned to the lineWidth property of the canvas’s drawing context. For now, we’ll set wordPadding to 2 in the script section of the <WordCloud /> component.

With all of the necessary variables set, call d3Cloud() and configure it using its chainable methods. Anytime a word is successfully placed in the canvas that’s used to simulate the layout algorithm, push an object with the calculated font size (size), coordinates (x and y), rotation (rotate) and the word itself to an array named cloudWords. Once everything is set up, call the .start() method on cloud to run the layout algorithm.

However, remember that Svelte’s reactivity only gets triggered on assignments.
Since the .push() method mutates the array in place without an assignment, we cannot use cloudWords directly to render the list of words in the markup section of the <WordCloud /> component. Therefore, once the layout algorithm finishes running, assign cloudWords to words. Then, within the <svg /> element in the markup section of the <WordCloud /> component, use an each block to loop over words and render a <text /> element for each word inside of a <g /> element (for grouping the <text /> elements).

When you run the project in development via npm run dev, you should see the word cloud rendered in the browser.

Currently, the size of a word communicates its frequency in a block of text. The larger the word, the more frequently it appears in the block of text. However, what if we also wanted to communicate a word’s frequency through its opacity? For example, the more faded a word is in the word cloud, the less frequently it appears in the block of text. To do this, we’ll use the extent() method from the d3-array module, which returns the minimum and maximum values of a collection, to determine the maximum frequency. Then, by dividing a word’s frequency by the maximum frequency, we get a decimal value between 0 and 1 that can be set as the opacity attribute of the word’s <text /> element.

Try customizing the word cloud for your own textual data. If you find yourself stuck at any point while working through this tutorial, then feel free to check out the live demo of this project on CodeSandbox.

If you want to learn more about building visualizations with D3 and Svelte, then check out the Better Data Visualizations with Svelte course by Connor Rothschild, a partner and data visualization engineer at Moksha Data Studio.
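
As a recap of the two data steps in this tutorial (counting word frequencies and deriving an opacity from them), here is a rough sketch in plain TypeScript. The WordFrequency type, the countWords and withOpacity helpers, the text and size property names, and the tokenizing rule are all assumptions made for this illustration. Math.max also stands in for the extent() method from d3-array so that the sketch has no dependencies, so treat it as an approximation of the idea rather than the exact code behind the <WordCloud /> component.

```typescript
// Hypothetical shape for illustration: `text` is the word and `size` is how
// often it occurs in the input.
interface WordFrequency {
  text: string;
  size: number;
}

// Hypothetical helper: counts how often each word appears in a block of text.
function countWords(input: string): WordFrequency[] {
  const counts = new Map<string, number>();

  // Assumed tokenizing rule: lowercase the text, then split on anything that
  // is not a letter or a straight apostrophe.
  const tokens = input
    .toLowerCase()
    .split(/[^a-z']+/)
    .filter((token) => token.length > 0);

  for (const token of tokens) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }

  // Most frequent words first.
  return Array.from(counts, ([text, size]) => ({ text, size })).sort(
    (a, b) => b.size - a.size
  );
}

// Hypothetical helper: adds an opacity equal to frequency / maximum frequency.
function withOpacity(
  words: WordFrequency[]
): Array<WordFrequency & { opacity: number }> {
  // Math.max is used here in place of d3-array's extent().
  const maxFrequency = Math.max(0, ...words.map((word) => word.size));

  return words.map((word) => ({
    ...word,
    opacity: maxFrequency > 0 ? word.size / maxFrequency : 0,
  }));
}

const sample = withOpacity(countWords("the cat and the hat and the bat"));
```

In a Svelte component, each opacity value could then be bound to the opacity attribute of the corresponding <text /> element inside the each block.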
