<button id="sidebar-toggle" class="icon-button" type="button" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
</div>
</div>
<div id="search-wrapper" class="hidden">
<form id="searchbar-outer" class="searchbar-outer">
<p>Rust has many types that let you work with numbers, characters, and so on. Some are simple, others are more complicated, and you can even create your own.</p>
<p>Rust has simple types that are called <strong>primitive types</strong> (primitive = very basic). We will start with integers and <code>char</code> (characters). Integers are whole numbers with no decimal point. There are two types of integers:</p>
<ul>
<li>Signed integers,</li>
<li>Unsigned integers.</li>
</ul>
<p>Signed means <code>+</code> (plus sign) and <code>-</code> (minus sign), so signed integers can be positive or negative (e.g. +8, -8). Unsigned integers do not have a sign, so they can only be zero or positive.</p>
<p>The signed integers are: <code>i8</code>, <code>i16</code>, <code>i32</code>, <code>i64</code>, <code>i128</code>, and <code>isize</code>.
The unsigned integers are: <code>u8</code>, <code>u16</code>, <code>u32</code>, <code>u64</code>, <code>u128</code>, and <code>usize</code>.</p>
<p>The number after the <code>i</code> or the <code>u</code> is the number of bits that the type uses, so types with more bits can hold larger numbers. 8 bits = one byte, so <code>i8</code> is one byte, <code>i64</code> is 8 bytes, and so on. For example, a <code>u8</code> can hold up to 255, but a <code>u16</code> can hold up to 65535. And a <code>u128</code> can hold up to 340282366920938463463374607431768211455.</p>
<p>So what are <code>isize</code> and <code>usize</code>? Their size is the number of bits on your type of computer. (The number of bits on your computer is called the <strong>architecture</strong> of your computer.) So <code>isize</code> and <code>usize</code> on a 32-bit computer are like <code>i32</code> and <code>u32</code>, and <code>isize</code> and <code>usize</code> on a 64-bit computer are like <code>i64</code> and <code>u64</code>.</p>
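<p>As a quick check of these limits, the sketch below prints the largest value of a few unsigned types using the <code>MAX</code> constants from the standard library. The helper <code>unsigned_max_for_bits</code> is a hypothetical name used only for illustration; it works out the same limit as 2 to the power of <code>bits</code>, minus 1.</p>

```rust
/// Hypothetical helper (not part of the standard library): the largest value
/// that fits in an unsigned integer with the given number of bits (up to 128).
fn unsigned_max_for_bits(bits: u32) -> u128 {
    if bits >= 128 {
        u128::MAX // shifting a u128 left by 128 bits would overflow
    } else {
        (1u128 << bits) - 1
    }
}

fn main() {
    println!("u8 max: {}", u8::MAX);
    println!("u16 max: {}", u16::MAX);
    println!("u128 max: {}", u128::MAX);
    // Signed types use one bit for the sign, so their largest value is smaller
    println!("i8 goes from {} to {}", i8::MIN, i8::MAX);
    println!("16 bits hold up to {}", unsigned_max_for_bits(16));
}
```

<p>Every integer type has <code>MIN</code> and <code>MAX</code> constants, so you usually don't need to remember these numbers.</p>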
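<p>To see which size your own computer uses, a small sketch like this one can help. It uses <code>std::mem::size_of</code> from the standard library; <code>usize_bits</code> is a hypothetical helper name chosen for this example.</p>

```rust
use std::mem::size_of;

/// Number of bits in `usize` on the machine this code is compiled for.
fn usize_bits() -> usize {
    size_of::<usize>() * 8 // size_of gives bytes, and one byte is 8 bits
}

fn main() {
    println!("usize is {} bits here", usize_bits());
    println!("isize is {} bits here", size_of::<isize>() * 8);
}
```

<p>On a 64-bit computer this should show 64 for both, and on a 32-bit computer it should show 32.</p>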
<p>There are many reasons for the different types of integers. One reason is computer performance: a type with fewer bytes uses less memory and can be faster to process. For example, the number -10 as an <code>i8</code> is <code>11110110</code>, but as an <code>i128</code> it is <code>11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110110</code>. But the different types have other uses too, and the first one is characters.</p>
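<p>The bit patterns for -10 can be printed with the <code>{:b}</code> (binary) format. This is only a small sketch, and the two helper functions are hypothetical names made up for this example.</p>

```rust
/// Bit pattern of an i8 as a string of 0s and 1s (always 8 digits).
fn bits_of_i8(n: i8) -> String {
    format!("{:08b}", n)
}

/// Bit pattern of an i128 as a string of 0s and 1s (always 128 digits).
fn bits_of_i128(n: i128) -> String {
    format!("{:0128b}", n)
}

fn main() {
    println!("{}", bits_of_i8(-10));
    println!("{}", bits_of_i128(-10));
    println!(
        "i8 is {} byte, i128 is {} bytes",
        std::mem::size_of::<i8>(),
        std::mem::size_of::<i128>()
    );
}
```

<p>Both patterns end in <code>11110110</code>. The <code>i128</code> just has many more 1s in front of it.</p>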
<p>Characters in Rust are called <code>char</code>. Every <code>char</code> has a number: the letter <code>A</code> is number 65, while the character <code>友</code> ("friend" in Chinese) is number 21451. The list of numbers is called "Unicode". Unicode uses smaller numbers for characters that are used more, like A through Z, or digits 0 through 9, or space.</p>
<pre><pre class="playground"><code class="language-rust">fn main() {
    let space = ' '; // A space inside ' ' is also a char
    let other_language_char = 'Ꮔ'; // Thanks to Unicode, other languages like Cherokee display just fine too
    let cat_face = '😺'; // Emojis are chars too
}
</code></pre></pre>
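<p>The Unicode numbers mentioned above can be checked by casting a <code>char</code> to a <code>u32</code>. In this sketch, <code>code_point</code> is a hypothetical helper name; <code>char::from_u32</code> is a standard library function that goes the other way and returns <code>None</code> if the number is not a valid character.</p>

```rust
/// Unicode number (code point) of a char.
fn code_point(c: char) -> u32 {
    c as u32
}

fn main() {
    println!("{}", code_point('A'));
    println!("{}", code_point('友'));
    // Going back from a number to a char can fail, so we get an Option
    println!("{:?}", char::from_u32(21451));
}
```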
<p>The characters that are used most have numbers less than 256, and they can fit into a <code>u8</code>. Remember, a <code>u8</code> is 0 plus all the numbers up to 255, for 256 in total. This means that Rust can safely <strong>cast</strong> a <code>u8</code> into a <code>char</code>, using <code>as</code>. ("Cast <code>u8</code> as <code>char</code>" means "treat this <code>u8</code> as a <code>char</code>")</p>
<p>Casting with <code>as</code> is useful because Rust is very strict. It always needs to know the type, and won't let you use two different types together even if they are both integers. For example, this will not work:</p>
<pre><pre class="playground"><code class="language-rust">fn main() { // main() is where Rust programs start to run. Code goes inside {} (curly brackets)
    let my_number = 100; // We didn't write a type of integer,
                         // so Rust chooses i32. Rust always
                         // chooses i32 for integers if you don't
                         // tell it to use a different type
    println!("{}", my_number as char); // ⚠️
}
</code></pre></pre>
<p>Here is the reason:</p>
<pre><code class="language-text">error[E0604]: only `u8` can be cast as `char`, not `i32`
 --> src\main.rs:3:20
  |
3 |     println!("{}", my_number as char);
  |                    ^^^^^^^^^^^^^^^^^
</code></pre>
<p>Fortunately we can easily fix this with <code>as</code>. We can't cast an <code>i32</code> as a <code>char</code>, but we can cast an <code>i32</code> as a <code>u8</code>, and then cast that <code>u8</code> as a <code>char</code>. So in one line we can write <code>my_number as u8 as char</code>. An easier way is to tell Rust that <code>my_number</code> is a <code>u8</code> from the start. Now it will compile:</p>
<pre><pre class="playground"><code class="language-rust">fn main() {
    let my_number: u8 = 100; // change my_number to my_number: u8
    println!("{}", my_number as char);
}
</code></pre></pre>
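<p>Here is a small sketch of casting twice with <code>as</code>, first to <code>u8</code> and then to <code>char</code>. One thing to be careful about: casting a number that is too big for a <code>u8</code> does not give an error, it just cuts the number down. The helper <code>i32_to_char_checked</code> is a hypothetical function written for this example (it is not from the standard library) that returns <code>None</code> when that would happen.</p>

```rust
/// Hypothetical helper: turn an i32 into a char only when it fits in a u8.
fn i32_to_char_checked(n: i32) -> Option<char> {
    if (0..=255).contains(&n) {
        Some(n as u8 as char)
    } else {
        None
    }
}

fn main() {
    let my_number = 100; // i32, because we didn't write a type
    println!("{}", my_number as u8 as char); // i32 -> u8 -> char

    let too_big: i32 = 356;
    // 356 does not fit in a u8, so `as u8` keeps only the lowest 8 bits (100)
    println!("{}", too_big as u8 as char);
    println!("{:?}", i32_to_char_checked(too_big));
}
```

<p>Both of the first two lines print <code>d</code>, because <code>d</code> is character number 100. The checked helper gives <code>None</code> for 356 instead of quietly changing the number.</p>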
<p>So those are two reasons for all the different number types in Rust. Here is another reason: <code>usize</code> is the size that Rust uses for <em>indexing</em>. (Indexing means "which item is first", "which item is second", etc.) <code>usize</code> is the best size for indexing because:</p>
<ul>
<li>An index can't be negative, so it needs to be a number with a u</li>
<li>It should be big, because sometimes you need to index many things, but</li>
<li>It can't always be a <code>u64</code>, because a 32-bit computer can't address that much memory. Its memory addresses are only 32 bits.</li>
</ul>
<p>So Rust uses <code>usize</code> so that your computer can get the biggest number for indexing that it can read.</p>
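<p>The points above can be seen in a short sketch. An array is indexed with a <code>usize</code>, and an index of another integer type has to be cast first. The helper <code>item_at</code> is a hypothetical name for this example; it uses the standard <code>.get()</code> method, which returns <code>None</code> instead of crashing when the index is too large.</p>

```rust
/// Hypothetical helper: the item at `index`, or None if the index is too large.
fn item_at(items: &[i32], index: usize) -> Option<i32> {
    items.get(index).copied()
}

fn main() {
    let numbers = [10, 20, 30];
    let index: usize = 1;
    println!("{}", numbers[index]); // the second item

    let signed_index: i32 = 2;
    // numbers[signed_index] would not compile, so we cast the i32 to usize
    println!("{}", numbers[signed_index as usize]);

    println!("{:?}", item_at(&numbers, 5)); // there is no item number 5
}
```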
<p>Let's learn some more about <code>char</code>. You saw that a <code>char</code> is always one character, and uses <code>''</code> instead of <code>""</code>.</p>
<p>All chars are 4 bytes. They are 4 bytes because some characters in a string are more than one byte. Basic letters that have always been on computers are 1 byte, later characters are 2 bytes, and others are 3 and 4. A <code>char</code> needs to be 4 bytes so that it can hold any kind of character.</p>
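<p>A small sketch can show the difference between the size of a <code>char</code> on its own and the number of bytes the same character needs inside a string. It uses <code>std::mem::size_of</code> and the <code>len_utf8</code> method from the standard library; <code>bytes_in_string</code> is a hypothetical helper name.</p>

```rust
use std::mem::size_of;

/// Number of bytes a char needs when it is stored inside a string.
fn bytes_in_string(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    println!("A char on its own is {} bytes", size_of::<char>());
    for c in ['a', 'ß', '国', '𓅱'] {
        println!("{} needs {} byte(s) in a string", c, bytes_in_string(c));
    }
}
```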
<p>We can use <code>.len()</code> to see this for ourselves:</p>
<pre><pre class="playground"><code class="language-rust">fn main() {
    println!("{}", "a".len()); // .len() gives the size in bytes
    println!("{}", "ß".len());
    println!("{}", "国".len());
    println!("{}", "𓅱".len());
}
</code></pre></pre>
<p>This prints:</p>
<pre><code class="language-text">1
2
3
4
</code></pre>
<p>You can see that <code>a</code> is one byte, the German <code>ß</code> is two, the Japanese <code>国</code> is three, and the ancient Egyptian <code>𓅱</code> is 4 bytes.</p>
<pre><pre class="playground"><code class="language-rust">fn main() {
    let slice = "Hello!";
    println!("Slice is {} bytes.", slice.len());
    let slice2 = "안녕!"; // Korean for "hi"
    println!("Slice2 is {} bytes.", slice2.len());
}
</code></pre></pre>
<p>This prints:</p>
<pre><code class="language-text">Slice is 6 bytes.
Slice2 is 7 bytes.
</code></pre>
<p><code>slice</code> is 6 characters in length and 6 bytes, but <code>slice2</code> is 3 characters in length and 7 bytes.</p>
<p>If <code>.len()</code> gives the size in bytes, what about the size in characters? We will learn about these methods later, but you can just remember that <code>.chars().count()</code> will do it. <code>.chars().count()</code> turns what you wrote into characters and then counts how many there are.</p>
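<p>Here is a minimal sketch that compares the two. The helper <code>char_count</code> is a hypothetical name for this example; it only wraps <code>.chars().count()</code>. Note that it counts <code>char</code>s, so a symbol that is built from more than one <code>char</code> (some emojis, for example) will count as more than one.</p>

```rust
/// Number of chars (not bytes) in a string.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let slice2 = "안녕!"; // Korean for "hi"
    println!(
        "Slice2 is {} bytes but only {} characters.",
        slice2.len(),
        char_count(slice2)
    );
}
```

<p>This prints <code>Slice2 is 7 bytes but only 3 characters.</code></p>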