Ownership and move semantics are among the things that make Rust unique. To understand this topic, you need a basic understanding of what the Stack and the Heap are. I wrote a post about that! You can check it out if you need a refresher on those concepts. This feature is a little hard to get used to because it forces you to think about things that you didn’t have to worry about in other languages. Enough introduction, let’s cut to the chase!

The three rules of ownership

There are three rules that govern the ownership system:

  1. Every initialized value has an owner: Every initialized value has a variable that is its owner.¹
  2. There is only one owner per value: You can’t have two or more variables that own the same value in memory. You can’t share ownership between variables.²
  3. If a variable’s scope ends, its value gets freed: When a scope ends, all values owned by variables contained in that scope get automatically freed.

¹ But not every variable owns a value; some just hold a reference. I’ll talk about this in the “References and Borrowing” article.
² Actually you can have more than one owner in safe Rust. You have to use special structures, such as Rc (multiple owners do not own the value directly though).
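To make the second footnote a bit more concrete, here is a minimal sketch of shared ownership with Rc. The function name owners_after_cloning is just something I made up for illustration. The idea is that every handle owns a share of the same allocation, and a counter keeps track of how many handles exist.

```rust
use std::rc::Rc;

/// Hypothetical helper: creates one Rc plus `extra` clones of it and
/// returns how many handles share the value at that point.
fn owners_after_cloning(extra: usize) -> usize {
    let first = Rc::new(String::from("hello"));
    // Rc::clone copies the handle and bumps the counter.
    // The String itself is not duplicated.
    let handles: Vec<Rc<String>> = (0..extra).map(|_| Rc::clone(&first)).collect();
    let count = Rc::strong_count(&first);
    drop(handles);
    count
}

fn main() {
    let a = Rc::new(String::from("hello"));
    let b = Rc::clone(&a);
    // Both handles point to the same allocation.
    println!("same allocation: {}", Rc::ptr_eq(&a, &b));
    println!("owners: {}", Rc::strong_count(&a));
    drop(b);
    println!("owners after dropping b: {}", Rc::strong_count(&a));
    println!("owners with two clones: {}", owners_after_cloning(2));
}
```

The String is only freed when the last handle goes away, so the three rules still hold for each individual Rc handle.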

Let’s test the rules! But before that, a little reminder of how the String type is represented in memory:

String representation in memory

where:

  • ptr: A pointer to the first address in the Heap where the contents of the string are stored (in this case hello).
  • len: How much memory, in bytes, the contents of the string are currently using.
  • capacity: The total amount of memory, in bytes, allocated for that string.
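If you want to look at these three fields without a debugger, the standard library exposes them through as_ptr, len and capacity. The sketch below is only an illustration (len_and_capacity is a name I made up). Keep in mind that the exact layout and field order of String is an implementation detail.

```rust
/// Hypothetical helper: returns (len, capacity) of a String built from `text`.
fn len_and_capacity(text: &str) -> (usize, usize) {
    let s = String::from(text);
    (s.len(), s.capacity())
}

fn main() {
    let s = String::from("hello");
    // Address of the first byte of the contents in the Heap.
    println!("ptr      = {:p}", s.as_ptr());
    // Bytes currently in use by the contents.
    println!("len      = {}", s.len());
    // Bytes allocated for the contents (always at least len).
    println!("capacity = {}", s.capacity());
    // The part that lives on the Stack: pointer + len + capacity.
    println!("stack size = {} bytes", std::mem::size_of::<String>());

    let (len, capacity) = len_and_capacity("hello");
    println!("helper says: len = {}, capacity = {}", len, capacity);
}
```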

GDB

In this post, I am going to explore what is happening in memory using the GNU Debugger (gdb) with the special command rust-gdb:

$ rust-gdb ./target/debug/move_semantics

I am going to use the x command a lot to explore the stack, together with the $sp value (which refers to the Stack Pointer).

Rule 1: Every initialized value has an owner

Consider the following code:

fn hello_world() -> u32 {
    String::from("hello! I am a free initialized String!");
    println!("{}", 42);
    42
}

fn main() {
    hello_world();
}

In the hello_world function, we have an initialized String value that is free (not assigned to a variable). Did Rust initialize the value in memory or just ignore it? We can’t use it so… Why would Rust save it? Let’s check what happens! When we compile this code we get the following warning:

warning: unused return value of `from` that must be used
 --> src/main.rs:2:5
  |
2 |     String::from("hello! I am a free initialized String!");
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_must_use)]` on by default

Rust warns us that we must use the returned value of the String::from function; otherwise, we can’t access it in any way. What happens in memory? Let’s check it out with GDB!

First, we set a breakpoint at the beginning of the hello_world function and execute the String initialization:

Breakpoint 1, move_semantics::hello_world () at src/main.rs:2
2           String::from("hello! I am a free initialized String!");
(gdb) n
3           println!("{}", 42);

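At this point, the String has been initialized, but it was never assigned to a variable, so it has no owner. A value like this is a temporary: Rust drops it as soon as the statement that created it ends. Let’s check what is left on the stack: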

(gdb) x/80ub $sp
0x7fffffffd980: 0       240     127     255     255     127     0       0
0x7fffffffd988: 61      60      87      85      85      85      0       0
0x7fffffffd990: 16      90      90      85      85      85      0       0
0x7fffffffd998: 38      0       0       0       0       0       0       0
0x7fffffffd9a0: 38      0       0       0       0       0       0       0
0x7fffffffd9a8: 2       0       0       0       0       0       0       0
0x7fffffffd9b0: 48      0       0       0       0       0       0       0
0x7fffffffd9b8: 96      255     255     255     255     255     255     255
0x7fffffffd9c0: 0       240     127     255     255     127     0       0
0x7fffffffd9c8: 5       0       0       0       0       0       0       0

It seems that the String value is there: the three 8-byte words starting at 0x7fffffffd990 look like our initialized value. Addresses 0x7fffffffd998 and 0x7fffffffd9a0 each hold 38, and the string happens to be 38 bytes long. 0x7fffffffd990 must hold the Heap address where the actual text is allocated! Let’s see what’s inside that memory address.

First, print the address in hexadecimal:

(gdb) x/xg 0x7fffffffd990
0x7fffffffd990:	0x00005555555a5a10

Then, explore what’s inside that address!

(gdb) x/38cb 0x00005555555a5a10
0x5555555a5a10: 0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'        0 '\000'
0x5555555a5a18: 16 '\020'       80 'P'  90 'Z'  85 'U'  85 'U'  85 'U'  0 '\000'        0 '\000'
0x5555555a5a20: 101 'e' 101 'e' 32 ' '  105 'i' 110 'n' 105 'i' 116 't' 105 'i'
0x5555555a5a28: 97 'a'  108 'l' 105 'i' 122 'z' 101 'e' 100 'd' 32 ' '  83 'S'
0x5555555a5a30: 116 't' 114 'r' 105 'i' 110 'n' 103 'g' 33 '!'

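Our String is mostly there! But its first 16 bytes were overwritten. That happens because a value that is not bound to a variable is dropped at the end of its statement: the Heap buffer was already freed, and the allocator most likely reused the beginning of it for its own bookkeeping. What we see on the stack is just stale data. It’s ok, that value isn’t owned by any variable, so we could never access it anyway.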

NOTE: This memory exploration was done using a debug build. I am not really sure what happens if this code is compiled in release mode. I believe that Rust skips initializing the value as an optimization, because it is never used.
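You can also watch an unassigned value being dropped without GDB. The sketch below uses a made-up type called Noisy (it is not part of the original example) that records a message when it is dropped. It shows that a value that is never bound to a variable is dropped at the end of its own statement, before the next statement runs.

```rust
use std::cell::RefCell;

// Hypothetical helper type (an assumption for this sketch):
// it pushes a message into `log` when it gets dropped.
struct Noisy<'a> {
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Noisy<'a> {
    fn new(log: &'a RefCell<Vec<&'static str>>) -> Self {
        Noisy { log }
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("temporary dropped");
    }
}

/// Creates an unbound Noisy value and returns the order of events.
fn temporary_drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    // Like the free String in hello_world: created, never bound to a variable.
    Noisy::new(&log);
    log.borrow_mut().push("next statement");
    log.into_inner()
}

fn main() {
    for event in temporary_drop_order() {
        println!("{}", event);
    }
}
```

The "temporary dropped" event is recorded first, which matches what the Heap dump suggested.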

Rule 2: There’s only one owner per value

Consider the following code:

fn main() {
    // Create a new value, with s1 as owner
    let s1 = String::from("hello world!");
    // Move ownership from s1 to s2
    let s2 = s1;
    // Oops! compiler error, the value has been moved!
    println!("{}", s1);
}

When we try to compile this, we get:

error[E0382]: borrow of moved value: `s1`
 --> src/main.rs:7:20
  |
3 |     let s1 = String::from("hello world!");
  |         -- move occurs because `s1` has type `String`, which does not implement the `Copy` trait
4 |     // Move ownership from s1 to s2
5 |     let s2 = s1;
  |              -- value moved here
6 |     // Oops! compiler error, the value has been moved!
7 |     println!("{}", s1);
  |                    ^^ value borrowed here after move

What is happening here is that the ownership of the String "hello world!" is transferred from s1 to s2. Because of that, the compiler invalidates the access to s1.
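If you really need both variables to stay usable, one option is to ask for an explicit duplicate with clone. This is a small sketch (clone_has_own_buffer is a hypothetical helper name): each String then owns its own Heap buffer, so the one-owner rule is not broken.

```rust
/// Hypothetical helper: returns true when the clone lives in a
/// different Heap buffer than the original.
fn clone_has_own_buffer() -> bool {
    let s1 = String::from("hello world!");
    let s2 = s1.clone();
    s1.as_ptr() != s2.as_ptr()
}

fn main() {
    let s1 = String::from("hello world!");
    // clone() allocates a new buffer and copies the bytes into it.
    let s2 = s1.clone();
    // Both variables are valid: each one owns a different value.
    println!("{} {}", s1, s2);
    println!("separate buffers: {}", clone_has_own_buffer());
}
```

The price is an extra allocation and a copy of the contents, which is why Rust makes you write it explicitly.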

The value was moved because the type String does not implement the Copy trait. Copy is implemented by types that live entirely on the stack and can be duplicated by simply copying their bits, without much overhead (duplicating data in the Heap is much more expensive). When a type implements the Copy trait, it has “copy semantics” instead of “move semantics”. This is usually the case for primitive types:

fn main() {
    let n1 = 42;
    let n2 = n1;
    println!("{}", n1);
    println!("{} {}", n1, n2);
}

If we run this code…

cargo run
   Compiling move_semantics v0.1.0 (/home/rust/blog)
    Finished dev [unoptimized + debuginfo] target(s) in 0.30s
     Running `target/debug/move_semantics`
42
42 42

…it compiles and runs, because the value 42 is copied instead of moved!
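Copy semantics are not limited to primitives. As a sketch, here is a made-up Point struct that opts in by deriving Copy (and Clone, which Copy requires). This only works because all of its fields are Copy themselves.

```rust
// Hypothetical example type: two integers, no Heap data.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Copies a Point, changes the copy, and returns both.
fn copy_then_modify() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    // This is a copy, not a move: p1 stays valid.
    let mut p2 = p1;
    p2.x = 10;
    (p1, p2)
}

fn main() {
    let (p1, p2) = copy_then_modify();
    // The original is untouched by the change made to the copy.
    println!("{:?} {:?}", p1, p2);
    println!("y values: {} {}", p1.y, p2.y);
}
```

If you add a String field to Point, the derive stops compiling, because String is not Copy.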

Rule 3: If a variable’s scope ends its value gets freed

Consider the following code:

fn main() {
    {
        // Create a new value with s1 as owner
        let s1 = String::from("hello world!");
    } // s1 gets dropped here, since this is the end of its scope

    println!("Checking drop with gdb!");
}

The allocation owned by s1 will have been freed by the time we reach line 7. This is because the curly braces at the beginning of the main function create a new scope. Once the code reaches the end of it, all the variables that it contains get dropped. Let’s check it out in GDB:

On line 4, we can find s1 in the local variables of the scope:

Breakpoint 1, move_semantics::main () at src/main.rs:4
4	        println!("{}", s1);
(gdb) info locals
s1 = "hello world!"

Let’s check where the Heap allocation of s1 is and what value it contains (remember that the first field of the Stack representation is the pointer to the Heap):

(gdb) p &s1
$1 = (*mut alloc::string::String) 0x7fffffffd960
(gdb) x/xg 0x7fffffffd960
0x7fffffffd960:	0x00005555555a5ad0
(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0:	104 'h'	101 'e'	108 'l'	108 'l'	111 'o'	32 ' '	119 'w'	111 'o'
0x5555555a5ad8:	114 'r'	108 'l'	100 'd'	33 '!'

But when the scope finishes…

7	    println!("Checking drop with gdb!");
(gdb) info locals
No locals.
(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0:	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'	0 '\000'
0x5555555a5ad8:	16 '\020'	80 'P'	90 'Z'	85 'U'

All the local variables were dropped and the memory they owned was freed! That part of the Heap is now filled with something else (most likely the allocator’s own bookkeeping data).
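The same thing can be observed from inside the program. In this sketch, Tracked is a made-up type that logs a message when it is dropped. It shows that the variables of a scope are dropped when the scope ends, in reverse order of declaration.

```rust
use std::cell::RefCell;

// Hypothetical helper type (an assumption for this sketch):
// it records "drop <name>" in `log` when it gets dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Runs a small inner scope and returns the order of events.
fn scope_events() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Tracked { name: "a", log: &log };
        let _b = Tracked { name: "b", log: &log };
        log.borrow_mut().push(String::from("end of inner scope"));
    } // _b is dropped here first, then _a
    log.borrow_mut().push(String::from("after inner scope"));
    log.into_inner()
}

fn main() {
    for event in scope_events() {
        println!("{}", event);
    }
}
```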

Moving a value: What happens under the hood?

Consider the following code:

fn move_stack_example() {
    // Create a new String value
    let s1 = String::from("hello world!");
    // Move it from s1 to s2 (s2 takes ownership)
    let s2 = s1;

    println!("{}", s2);
}

fn main() {
    move_stack_example();
}

When we move a value, it does not disappear from memory. Instead, the part of the value that lives on the stack gets duplicated, and the compiler simply forbids us from ever accessing the old variable again.

Under the hood

Let’s verify what I just said with GDB. We are going to examine the stack frame of the move_stack_example function. First of all, let’s check the local variables:

(gdb) info locals
s2 = "hello world!"
s1 = "hello world!"

Whoa! Looks like s1 and s2 have the same value! Actually they are pointing to the same value. Let’s now see what the addresses of s1 and s2 are:

(gdb) p &s1
$1 = (*mut alloc::string::String) 0x7fffffffdb08
(gdb) p &s2
$2 = (*mut alloc::string::String) 0x7fffffffdb20

Great! Now, we know that s1’s stack representation starts at 0x7fffffffdb08 and s2’s starts at 0x7fffffffdb20. Let’s now see the contents of the stack frame:

(gdb) x/80bu $sp
0x7fffffffdaf0:	112	171	217	247	255	127	0	0
0x7fffffffdaf8:	7	125	221	247	255	127	0	0
0x7fffffffdb00:	2	0	0	0	0	0	0	0
0x7fffffffdb08:	208	90	90	85	85	85	0	0
0x7fffffffdb10:	12	0	0	0	0	0	0	0
0x7fffffffdb18:	12	0	0	0	0	0	0	0
0x7fffffffdb20:	208	90	90	85	85	85	0	0
0x7fffffffdb28:	12	0	0	0	0	0	0	0
0x7fffffffdb30:	12	0	0	0	0	0	0	0
0x7fffffffdb38:	0	0	0	0	0	0	0	0

What do we have at the addresses of s1 and s2? Let’s check it out:

  • ptr:
    • For s1 this value is at 0x7fffffffdb08.
    • For s2 this value is at 0x7fffffffdb20.
  • len:
    • For s1 this value is at 0x7fffffffdb10.
    • For s2 this value is at 0x7fffffffdb28.
  • capacity:
    • For s1 this value is at 0x7fffffffdb18.
    • For s2 this value is at 0x7fffffffdb30.

As you can see, both ptr values are the same, meaning that both variables are pointing to the same data in the Heap. Let’s print them in hexadecimal to get the correct format to explore it:

(gdb) x/xg 0x7fffffffdb08
0x7fffffffdb08: 0x00005555555a5ad0
(gdb) x/xg 0x7fffffffdb20
0x7fffffffdb20: 0x00005555555a5ad0

So, the ptr value is 0x00005555555a5ad0! Now, take a look at the contents of that address in the Heap:

(gdb) x/12c 0x00005555555a5ad0
0x5555555a5ad0:	104 'h'	101 'e'	108 'l'	108 'l'	111 'o'	32 ' '	119 'w'	111 'o'
0x5555555a5ad8:	114 'r'	108 'l'	100 'd'	33 '!'

The hello world! string is there!
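The GDB session above can be approximated in plain Rust. In this sketch (move_keeps_buffer is a hypothetical helper name), I save the Heap pointer before the move and compare it with the pointer seen through the new owner. Only the stack part (ptr, len, capacity) is copied, so the Heap buffer stays where it was.

```rust
/// Hypothetical helper: returns true when the Heap buffer has the
/// same address before and after moving the String.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello world!");
    // A raw pointer does not borrow s1, so we can still move it afterwards.
    let before = s1.as_ptr();
    // Move: s2 takes ownership, s1 can no longer be used.
    let s2 = s1;
    before == s2.as_ptr()
}

fn main() {
    println!("same Heap buffer after the move: {}", move_keeps_buffer());
}
```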

Conclusion

It can take some time to get used to working with ownership and move semantics, but, in my opinion, that is time well invested. Manually managing memory (by allocating and freeing it) is not an easy task and can cause many bugs, such as use-after-free and double free. With Rust’s approach, those bugs are caught at compile time in safe code, so they never make it into the running program!

If you want to read more about this topic, check out the Rust book.