The Rust Language

Contents

Basic I/O
- Dependencies
Data
- Shadowing
- Data Types
Ownership
- The Stack and the Heap
  - Rules of Ownership
References and Borrowing
The Slice Type
Structs
- Methods
  - C++ and C -> operator
Enums
Crates and Modules
Common Collections
- Vectors
  - Iteration:
- String
HashMaps
Error Handling
- Error Propagation
Generic Types, Traits and Lifetimes.
- Traits
- Lifetimes
Testing
Functional Programming in Rust
Iterators
Extra Patterns
Destructuring

Start a New Project using: cargo new [Project Name]
Build with: cargo build
Build and Run: cargo run
Check if it will compile without actually compiling: cargo check
Build for release with optimisation: cargo build --release

Basic I/O

// add the library needed for processing inputs from the standard library
use std::io;

fn main() {
	// println! is used for output
    println!("Guess the number!");

    println!("Please input your guess.");

    // define a mutable variable bound to a new, empty String
    let mut guess = String::new();

    // read a line of input
    io::stdin()
        .read_line(&mut guess) // append the input to the guess variable
        // &mut passes a mutable reference so read_line can modify guess
        .expect("Failed to read line");

    println!("You guessed: {guess}");
}

println! is a macro that prints to the screen. The let statement creates a new variable named guess and binds it to a new, empty String. In Rust, variables are immutable by default, meaning once we give the variable a value, the value won’t change. Therefore we add mut to make it mutable.

::new() indicates that new is an associated function of the String type. An associated function is one that is implemented on a type.

If we hadn’t imported the io library with use std::io; at the beginning of the program, we could still use the function by writing this function call as std::io::stdin. The stdin function returns an instance of std::io::Stdin, which is a type that represents a handle to the standard input for your terminal.

The & marks a reference to the guess variable; &mut makes that reference mutable so read_line can write into the string.

read_line appends whatever the user enters to the string we pass to it, but it also returns a Result value. Result is an enumeration, often called an enum, which is a type that can be in one of multiple possible states. We call each possible state a variant; Result’s variants are Ok and Err, and expect panics with the given message if the value is an Err.

The format string passed to println! can contain placeholders: a {} set of curly brackets is a placeholder, like little crab pincers that hold a value in place.

Dependencies

We can add a dependency by listing it in the Cargo.toml file; cargo build then downloads and compiles it. Here we use the Rng trait from the rand crate for random number generation:

use rand::Rng;
let secret_number = rand::thread_rng().gen_range(1..=100);

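Rng is a trait that defines the methods random number generators implement, so it must be in scope to use them. We call rand::thread_rng() to get the generator we will use, then call its gen_range method. (These are the names in rand 0.8; newer releases of the crate rename them.)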

use std::io;
use std::cmp::Ordering;
use rand::Rng;

fn main() {
    println!("Guess the number!");

    let secret_number = rand::thread_rng().gen_range(1..=100);

    // loop lets us make an infinite loop
    loop {
        println!("Please input your guess.");

        let mut guess = String::new();

        io::stdin()
            .read_line(&mut guess)
            .expect("Failed to read line");

        // guess is a String, so it can't be compared with a number directly.
        // We annotate the new guess as u32 and parse converts the text to that type.
        // Integers default to i32, but comparing with a u32 makes Rust infer
        // that secret_number is a u32 too.
        let guess: u32 = guess.trim().parse().expect("Please be a number");

        println!("You guessed: {guess}");

        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too Small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You Win!");
                break;
            }
        }
    }
}

Ordering is an enum whose three variants (Less, Greater, Equal) are the possible results of the cmp method. Rust has several number types, for example:

i32 = a signed 32 bit integer
u32 = an unsigned 32 bit integer
i64 = a signed 64 bit integer
u64 = an unsigned 64 bit integer

We already have a guess variable, but shadowing lets us define a new guess variable that reuses the name.

We should also handle errors instead of crashing on bad input; here an invalid guess is skipped and the loop asks again:

let guess: u32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
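
The same match on the Result of parse can be tried outside the game loop. This is a minimal sketch; parse_guess is a made-up helper name, not part of the original program, and it returns an Option so the caller decides what to do with bad input.

```rust
// Hypothetical helper: turns one line of user input into a guess.
fn parse_guess(input: &str) -> Option<u32> {
    // trim removes the trailing newline that read_line leaves in the string
    match input.trim().parse::<u32>() {
        Ok(num) => Some(num),
        Err(_) => None, // not a non-negative whole number
    }
}

fn main() {
    for line in ["42\n", "  7 ", "abc", "-3"] {
        match parse_guess(line) {
            Some(n) => println!("{line:?} parsed as {n}"),
            None => println!("{line:?} is not a valid guess"),
        }
    }
}
```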

Data

Variables in Rust are by default immutable. You can make them mutable by using the mut keyword:

let mut x = 5;

Constants are declared with const and always need a type annotation:

const TIME: u32 = 60 * 60 * 3;

Shadowing

We can declare another variable with the same name as a previous one. Rustaceans say that the first variable is shadowed by the second, which means that the second variable is what the compiler will see when you use the name of the variable. The second variable overshadows the first.

fn main() {
    let x = 5;

    let x = x + 1;

    {
        let x = x * 2;
        println!("The value of x in the inner scope is: {x}");
    }

    println!("The value of x is: {x}");
}

If you don’t use let, you are trying to modify a variable which isn’t mutable, so the compiler will report an error. When we use let we are defining a new variable that reuses the name and can use the value of the previous one.

If a variable is mutable you can mutate its value but not its type.
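
The difference between shadowing and mut shows up when the type changes. A small sketch, assuming a made-up helper called shadowed_len:

```rust
// Hypothetical helper showing shadowing with a change of type.
fn shadowed_len(text: &str) -> usize {
    // text is a &str here ...
    let text = text.len(); // ... and a usize from here on
    text
}

fn main() {
    let mut total = 0;
    total += shadowed_len("   "); // mutating the value is fine
    // total = "three"; // would not compile: a mut variable cannot change type
    println!("{total}");
}
```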

Data Types

In Rust a data type is either scalar or compound.

Scalar Types

Integer:

Length	Signed	Unsigned
8-bit	i8	u8
16-bit	i16	u16
32-bit	i32	u32
64-bit	i64	u64
128-bit	i128	u128
arch	isize	usize

isize and usize depend on the architecture the code is running on: on a 64-bit architecture they are 64 bits, on a 32-bit architecture they are 32 bits.

Number literals	Example
Decimal	`98_222`
Hex	`0xff`
Octal	`0o77`
Binary	`0b1111_0000`
Byte (`u8` only)	`b'A'`

The primary situation in which you’d use isize or usize is when indexing some sort of collection. Rust uses the term panicking when a program exits with an error. Integer overflow is one cause: in debug builds an overflowing operation panics, while in release builds the value wraps around.
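
When overflow is a real possibility, the standard library integer methods let you choose the behaviour explicitly instead of relying on the build mode. A short sketch; safe_add is a made-up wrapper name used only for illustration:

```rust
// Hypothetical wrapper: None means the sum does not fit in a u8.
fn safe_add(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    let max = u8::MAX; // 255, the largest u8

    println!("{:?}", safe_add(max, 1)); // checked: reports the overflow
    println!("{}", max.wrapping_add(1)); // wrapping: goes back around to 0
    println!("{}", max.saturating_add(1)); // saturating: stays at the maximum
    println!("{:?}", max.overflowing_add(1)); // wrapped value plus an overflow flag
}
```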

In Rust floating point numbers are either f32 or f64; the default is f64.

let score: f64 = 89.999999;

We can use bool to specify a Boolean type:

let t: bool = false;

Character Type:

fn main() {
    let c = 'z';
    let z: char = 'ℤ'; // with explicit type annotation
    let heart_eyed_cat = '😻';
}

Rust’s char is 4 bytes in size and represents a Unicode scalar value. This means it can represent a lot more than ASCII: accented letters, Chinese, Japanese and Korean characters, emoji, and so on. Note that what a person thinks of as a character does not always correspond to a single char.

Compound Types

We can group values of different types in a tuple:

fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);
    // destructure the tuple into three variables
    let (x, y, z) = tup;
}

Arrays in rust have a fixed length:

fn main() {
    // array a with 5 elements of type i32
    let a: [i32; 5] = [1, 2, 3, 4, 5];

    // array b will look like this: [3, 3, 3, 3, 3]
    let b = [3; 5];
    println!("{}", a[0]); // 1
    println!("{:?}", b); // [3, 3, 3, 3, 3]
}

a has a length of 5 and we can’t change that. Arrays are useful when you want your data allocated on the stack rather than the heap. A vector is a collection type which can grow in length.

Functions

Statements are instructions that perform some action and do not return a value.
Expressions evaluate to a resultant value.

Rust code uses snake case as the conventional style for function and variable names, in which all letters are lowercase and underscores separate words.

Rust doesn’t care where you define a function as long as it’s in a scope the caller can see. Rust is an expression-based language; a new scope block created with curly brackets is itself an expression:

fn main() {
    let y = {
        let x = 3;
        x + 1
    };

    println!("The value of y is: {y}");
}

Here y is bound to 4 because the block evaluates to 4 implicitly. Note that the expression x + 1 doesn’t end in a semicolon: if you add one it becomes a statement and no value is produced. The same rule gives a function its return value:

fn plus_one(x: i32) -> i32 {
	x + 1
}

Control Flow

fn main() {
    let number = 3;

    if number < 5 {
        println!("condition was true");
    } else if number > 20 {
        println!("Wowie Zowie");
    } else {
        println!("condition was false");
    }
}

Rust doesn’t automatically convert non-Boolean types to a Boolean like many languages do, hence you can’t do this:

fn main() {
    let number = 3;

    if number {
        println!("number was three");
    }
}

The condition must be a bool. Instead you can write if number == 3.

We can use if in a let statement:

let condition = true;
// both arms must evaluate to the same type or this will not compile
let number = if condition { 5 } else { 6 };

Blocks of code evaluate to the last expression in them, and numbers by themselves are also expressions. In this case, the value of the whole if expression depends on which block of code executes.

We can return values from a loop using a break.

fn main() {
    let mut counter = 0;

    let result = loop {
        counter += 1;

        if counter == 10 {
            break counter * 2;
        }
    };

    println!("The result is {result}");
}

We can label loops when they are nested, so that break and continue can target an outer loop:

fn main() {
    let mut count = 0;
    'counting_up: loop {
        println!("count = {count}");
        let mut remaining = 10;

        loop {
            println!("remaining = {remaining}");
            if remaining == 9 {
                break;
            }
            if count == 2 {
                break 'counting_up;
            }
            remaining -= 1;
        }

        count += 1;
    }
    println!("End count = {count}");
}

A while loop runs for as long as its condition is true:

fn main() {
    let mut number = 3;

    while number != 0 {
        println!("{number}!");

        number -= 1;
    }

    println!("LIFTOFF!!!");
}

For loops!!

fn main() {
    let a = [10, 20, 30, 40, 50];

    for element in a {
        println!("the value is: {element}");
    }
}

Ranged loops: a for loop can also iterate over a range; rev is only there to reverse the range.

fn count_down() {
    for number in (1..4).rev() {
        println!("{number}!");
    }
    println!("LIFTOFF!!!");
}
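
Ranges come in two forms: 1..4 stops before 4, while 1..=4 includes it. The sketch below collects a reversed inclusive range into a vector so the values can be inspected; countdown is a made-up helper name.

```rust
// Hypothetical helper: collects a countdown so the values can be inspected.
fn countdown(from: u32) -> Vec<u32> {
    // 1..=from includes `from`; 1..from would stop one short of it
    (1..=from).rev().collect()
}

fn main() {
    for number in countdown(3) {
        println!("{number}!");
    }
    println!("LIFTOFF!!!");
}
```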

Ownership

Rust doesn’t have a garbage collector so we manage memory using ownership.

Ownership = a set of rules that govern how a Rust program manages memory. Many languages have a garbage collector which looks for and frees memory that is no longer used; others, like C, require you to explicitly allocate and free memory. Rust uses neither of these; instead it uses a system of ownership that the compiler checks.

The Stack and the Heap

Where data is allocated in memory matters for ownership. The stack is a LIFO data structure: we can push data onto it or pop it off. All data on the stack has to have a known, fixed size; data with a size unknown at compile time has to go on the heap. The heap is less organised: when you put data on the heap you request a certain amount of memory. An allocator then finds an empty space that is big enough and returns a pointer to the start of it; this is called allocating. The pointer is a fixed size, so you can store it on the stack.

To get the actual data you have to follow the pointer.
Pushing to the stack is faster than allocating on the heap because there is no allocator involved and you don’t have to look for a free space.
Heap access is slower since you have to follow a pointer.

When you call a function, the values passed in and the function’s local variables are pushed onto the stack; when the function is done the values are popped off the stack.

Keeping track of data on the heap, minimising duplicates and cleaning up are problems ownership solves.

Rules of Ownership

Each value has an owner.
There can only be one owner at a time.
When the owner goes out of scope the value is dropped.

In Rust a variable is valid for as long as it’s in scope.

The String data type doesn’t have a fixed size, so its contents are allocated on the heap and can grow. You can make one from a string literal using String::from:

let mut s = String::from("hello");
s.push_str(", world!");
println!("{}", s); // "hello, world!"

For string literals we know the content at compile time, so it is hardcoded in the executable. This makes string literals very fast, but immutable. String allocates memory on the heap instead, so we need a way to return this memory to the allocator when we’re done with it.

String::from allocates memory on the heap. In other languages like C we would need to free() this memory when we’re done, and in garbage collected languages this is automatic. In Rust memory is automatically freed when the variable that owns it goes out of scope.

When a variable goes out of scope Rust calls a function called drop to free the memory.

This pattern of deallocating resources at the end of an item’s lifetime is sometimes called Resource Acquisition Is Initialization (RAII). In C++ this is done via destructors.

A String is made up of three parts stored on the stack: a pointer to the heap memory that holds the contents, a length and a capacity.

let s1 = String::from("hello");
let s2 = s1;

If we do that we don’t copy the data on the heap; we just copy the pointer, length and capacity on the stack, so both names point at the same heap address. If both were still valid this would be an issue, because when they go out of scope drop would try to free the same memory twice: a double free error.

To avoid the double free, after let s2 = s1; Rust no longer considers s1 valid, so nothing is freed when s1 goes out of scope. This can be viewed as a shallow copy that invalidates the initial variable; we call this a move. If you try to use s1 afterwards the compiler will tell you the value has been moved.

If however you do want to deep copy the heap data of the String, not just the stack data, use the clone method.

let s1 = String::from("Hello");
let s2 = s1.clone();
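
One way to see the difference between a move and a clone is to compare the heap addresses that the strings point at. This is only an illustrative sketch; move_and_clone is a made-up function name.

```rust
// Returns (move kept the same heap buffer, clone got a new heap buffer).
fn move_and_clone() -> (bool, bool) {
    let s1 = String::from("hello");
    let heap_address = s1.as_ptr(); // where the text lives on the heap

    let s2 = s1; // move: only pointer, length and capacity are copied
    // println!("{s1}"); // would not compile: s1 has been moved
    let moved_same = s2.as_ptr() == heap_address;

    let s3 = s2.clone(); // deep copy: a second heap buffer is allocated
    let clone_differs = s3.as_ptr() != s2.as_ptr();

    println!("{s2} {s3}");
    (moved_same, clone_differs)
}

fn main() {
    println!("{:?}", move_and_clone());
}
```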

For types stored entirely on the stack, such as integers, this doesn’t apply; the original variable stays valid after you assign it to a new name.

let x = 5;
let y = x;

x and y both exist because copying data on the stack is quick, hence Rust can just copy it. For these types there is no difference between a shallow and a deep copy, and no move. Such types implement the Copy trait: their values are copied rather than moved. A type can’t implement Copy if it, or any of its parts, implements Drop.

Ownership and Functions

When you pass a value to a function the value is moved (or copied, for Copy types), just like assignment. The function takes ownership of a moved value and drops it when it’s done.

fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
                                    // ... and so is no longer valid here

    let x = 5;                      // x comes into scope

    makes_copy(x);                  // x would move into the function,
                                    // but i32 is Copy, so it's okay to still
                                    // use x afterward

} // Here, x goes out of scope, then s. But because s's value was moved, nothing
  // special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called. The backing
  // memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{}", some_integer);
} // Here, some_integer goes out of scope. Nothing special happens.

When you return a value from a function, its ownership is transferred to wherever the return value is bound. So if a function name() returned a String and we bind it with let name = name(); then the returned value is now owned by name.

fn calc_len(s: String) -> (String, usize) {
    let length = s.len();
    (s, length) // multiple values returned as a tuple, handing s back to the caller
}

References and Borrowing

Instead of transferring ownership we can allow a function to borrow a value by passing a reference: a pointer holding an address we can follow to find the data, which is still owned by some other variable.

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}

The & creates a reference, allowing us to refer to a value rather than taking ownership of it. We can use * to dereference a reference. When a reference goes out of scope the value it points to is not dropped, because the reference doesn’t own it. When we pass a reference we don’t need to return the value in order to give back ownership, since we never had it in the first place.

This idea of creating a reference is called borrowing. We can’t modify a value borrowed through an immutable reference, but we can read it.

Mutable References

fn main() {
    let mut s = String::from("Hello");
    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", World");
}

If we want to modify a borrowed value we have to pass a mutable reference. You can only ever have one mutable reference to a value at a time. We also cannot have a mutable reference while an immutable reference to the same value is still in use.
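
A borrow lasts until the last use of the reference, not until the end of the block, so immutable and mutable borrows can follow one another. A minimal sketch with a made-up function name, borrow_in_turn:

```rust
fn borrow_in_turn() -> String {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s; // any number of immutable references is fine
    println!("{r1} and {r2}");
    // r1 and r2 are not used after this point, so their borrows end here

    let r3 = &mut s; // now a mutable reference is allowed
    r3.push_str(", world");
    // println!("{r1}"); // using r1 here would turn the &mut borrow into an error

    s
}

fn main() {
    println!("{}", borrow_in_turn());
}
```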

Rust doesn’t allow dangling pointers: the compiler rejects code that would create one, with an error such as: this function's return type contains a borrowed value, but there is no value for it to be borrowed from.
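
A sketch of the kind of function that triggers this error, left commented out so the example compiles, next to the usual fix of returning the owned value. The function names are illustrative.

```rust
// Rejected by the compiler: s is dropped at the end of the function,
// so the returned reference would point at freed memory.
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s
// }

// Accepted: ownership of the String moves out to the caller.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    let s = no_dangle();
    println!("{s}");
}
```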

The Slice Type

Slices let you reference a contiguous sequence of elements in a collection rather than the whole collection. A slice is a kind of reference, so it has no ownership.

If we write a function that returns an index into a string (say, the end of the first word), that index is separate from the string itself, so it can get out of sync: when the string changes, the index may no longer be valid and you get bugs. To fix this we can use string slices. A slice is a reference to part of a string:

let s = String::from("hello world");

let hello = &s[0..5];
let world = &s[6..11];

Rather than being a reference to the whole String, hello is a reference to part of it, specified by [starting_index..ending_index], where ending_index is one past the last position in the slice. Internally a slice stores a pointer to its starting position and its length. So world is a slice that contains a pointer to the byte at index 6, with a length of 5. If the first index is 0, or the last index equals the length, you can drop the number:

let hello = &s[..5];
let slice = &s[3..];
let whole_string = &s[..];

We can now write first_word like this:

fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    return &s[..];
}

Now if you clear the string while the slice is still in use, the compiler will reject the code: cannot borrow `s` as mutable because it is also borrowed as immutable. We can also use a slice as the parameter type of the function (s: &str); this lets us pass either a &str or a &String to the function.

Slices also work on arrays; if we slice an array of i32 numbers the slice has the type &[i32].
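
Both points can be sketched together: a first_word that takes &str, and a function over an array slice. The sum helper is made up for the example.

```rust
// Taking &str accepts string literals, slices and (via coercion) &String.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            return &s[..i];
        }
    }
    s
}

// Hypothetical helper: works on any slice of i32 values.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let owned = String::from("hello world");
    println!("{}", first_word(&owned)); // &String coerces to &str
    println!("{}", first_word("goodbye")); // a literal is already a &str

    let a = [1, 2, 3, 4, 5];
    println!("{}", sum(&a[1..3])); // slice of type &[i32] covering 2 and 3
}
```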

Structs

Structs are similar to tuples, but each value has a name.

struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

We can create an instance by giving each field a value, and change a field with dot notation if the instance is mutable:

fn main() {
    let mut user1 = User {
        active: true,
        username: String::from("someusername123"),
        email: String::from("someone@example.com"),
        sign_in_count: 1,
    };
    user1.email = String::from("someoneelse@example.com");
}

The entire instance has to be mutable to change a field; Rust doesn’t let you mark only certain fields as mutable. When a parameter has the same name as a field we can construct the instance using field init shorthand:

fn build_user(email: String, username: String) -> User {
    User {
        active: true,
        username,
        email,
        sign_in_count: 1,
    }
}

// query_total_users is assumed to be defined elsewhere and to return a u64
fn get_fraction_signin(user: &User) -> f32 {
    let total = query_total_users();
    user.sign_in_count as f32 / total as f32
}

fn main() {
    let user1: User = build_user(String::from("email@company.com"), String::from("steven"));
    // borrow user1
    let signin_fraction: f32 = get_fraction_signin(&user1);
    println!("{}%", signin_fraction * 100.0);
}

We can use an existing instance to fill in the remaining fields of a new one (struct update syntax):

fn main() {
    // assumes user1 was created as above
    let user2 = User {
        email: String::from("another@example.com"),
        ..user1
    };
}
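
Struct update syntax moves any non-Copy fields it takes from the old instance. The sketch below shows which fields of user1 stay usable afterwards; update_demo is a made-up function name.

```rust
struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

// Returns (user1's email, user2's email, user2's username).
fn update_demo() -> (String, String, String) {
    let user1 = User {
        active: true,
        username: String::from("someusername123"),
        email: String::from("someone@example.com"),
        sign_in_count: 1,
    };

    let user2 = User {
        email: String::from("another@example.com"),
        ..user1 // username is moved; active and sign_in_count are copied
    };

    // println!("{}", user1.username); // would not compile: the field was moved
    // email was not taken from user1 and bool is Copy, so these still work:
    println!("{} {}", user1.email, user1.active);
    println!("{} {}", user2.active, user2.sign_in_count);

    (user1.email, user2.email, user2.username)
}

fn main() {
    println!("{:?}", update_demo());
}
```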

Structs don’t have to use named fields:

struct Colour(i32, i32, i32);
struct Point(i32, i32, i32);

fn main() {
	let black = Colour(0,0,0);
	let origin = Point(0,0,0);
}

We can also define unit-like structs with no data:

struct AlwaysEqual;

fn main() {
    let subject = AlwaysEqual;
}

If we want to be able to print the struct with println! we need to derive the Debug trait:

#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };
	// using {:#?} instead will make the output nicer. 
    println!("rect1 is {:?}", rect1);
}
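
A short sketch comparing the two Debug formats, plus the dbg! macro from the standard library, which prints the file, line and value to stderr and hands the value back:

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!("{:?}", rect1); // compact, on one line
    println!("{:#?}", rect1); // pretty-printed, one field per line

    // dbg! returns the value of the expression, so it can wrap a computation
    let area = dbg!(rect1.width * rect1.height);
    println!("{area}");
}
```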

Methods

Methods are functions defined within the context of a struct, enum or trait. Their first parameter is always self, which represents the instance the method is called on.

#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

// implementation for Rectangle
impl Rectangle {
	// &self is a borrow of the Self instance of Rectangle
    fn area(&self) -> u32 {
        self.width * self.height
    }
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!(
        "The area of the rectangle is {} square pixels.",
        rect1.area()
    );
}

You can, if you really want to, give a method the same name as one of the struct’s fields.
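
A sketch of both ideas: calling can_hold with a second rectangle, and a method that shares its name with the width field. What the width method returns here is an arbitrary choice made for the example.

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Same name as the field: rect.width is the field, rect.width() the method.
    fn width(&self) -> bool {
        self.width > 0
    }

    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    let rect2 = Rectangle { width: 10, height: 40 };

    println!("rect1 has a nonzero width: {}", rect1.width());
    println!("rect1 is {} wide", rect1.width);
    println!("rect1 can hold rect2: {}", rect1.can_hold(&rect2));
}
```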

C++ and C -> operator

In C and C++ there are two operators for accessing members:

Directly on an object: object.method()
Through a pointer to an object, dereferencing it first: object->method()
- Used if object is a pointer; equivalent to (*object).method()

Rust has automatic referencing and dereferencing for method calls; it will automatically add &, &mut or * so that the object matches the signature of the method. E.g. the following are the same:

p1.distance(&p2);
(&p1).distance(&p2);

Given the receiver and the name of the method, Rust can work out whether the method is reading (&self), mutating (&mut self) or consuming (self). All functions defined within an impl block are associated functions because they are associated with the type named after impl. Associated functions that don’t have self as a parameter are not methods; they are often used as constructors.
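
The two calls can be checked with a small type. Point and its distance method are assumed here for illustration, since the notes only show the call sites.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Straight-line distance between two points.
    fn distance(&self, other: &Point) -> f64 {
        let dx = self.x - other.x;
        let dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

fn main() {
    let p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    let implicit = p1.distance(&p2); // Rust borrows p1 for us
    let explicit = (&p1).distance(&p2); // the same call written out
    println!("{implicit} {explicit}");
}
```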

Let’s say we define a constructor:

impl Rectangle {
    // by convention constructors use lowercase names such as new
    fn new(size: u32) -> Self {
        Self {
            width: size,
            height: size,
        }
    }
}

fn main() {
    let square = Rectangle::new(7);
}

We call associated functions using :: along with the struct name. You could also split the implementation up and have one impl block per method.

Enums

Define a type by enumerating all possible variants. For example we can define the two possible states of an IP address:

enum IpAddrKind {
    V4,
    V6,
}

Create instances:

let four = IpAddrKind::V4;
let six = IpAddrKind::V6;

fn route(ip_kind: IpAddrKind) {}

route(IpAddrKind::V4);

We could represent an IP address using a struct with an IpAddrKind field, but using just an enum is more concise since we can attach data to each variant directly:

enum IpAddr {
	V4(String),
	V6(String),
}

let home = IpAddr::V4(String::from("127.0.0.1"));

Enums can hold a range of types:

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

If we did this using a struct we would need 4 separate structs.
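
Because all four variants share one type, a single function can accept any of them and use match to pull the data out. A sketch; describe is a made-up function name.

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

// Each arm destructures the data carried by its variant.
fn describe(msg: &Message) -> String {
    match msg {
        Message::Quit => String::from("quit"),
        Message::Move { x, y } => format!("move to ({x}, {y})"),
        Message::Write(text) => format!("write {text:?}"),
        Message::ChangeColor(r, g, b) => format!("change colour to rgb({r}, {g}, {b})"),
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hi")),
        Message::ChangeColor(0, 160, 255),
    ];
    for msg in &messages {
        println!("{}", describe(msg));
    }
}
```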

Optional Values

In most languages the absence of a value is represented by null. Rust doesn’t have null, because experience from other languages shows that null is a common source of bugs. Instead we use the Option enum, which is defined in std like so:

enum Option<T> {
	None,
	Some(T),
}

We can use the Option<T> type to show an absence of values:

let some_number = Some(5);
let no_number: Option<i32> = None;

The type of some_number is Option<i32>; this is inferred because we specified a value in Some. Since no_number has no value Rust can’t infer its type, so we have to annotate it.

We cannot add an i8 and an Option<i8> because there is a chance there is no value. So we have to convert the Option<T> to a T first. If we want to get the value in Some we can call the .unwrap() method, but it panics if the value is None; match or unwrap_or handle the None case explicitly.
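
A minimal sketch of getting from Option<i8> to i8 without risking a panic; add_optional is a made-up helper name.

```rust
// Hypothetical helper: adds extra to base only when there is a value.
fn add_optional(base: i8, extra: Option<i8>) -> i8 {
    // base + extra would not compile: i8 and Option<i8> are different types
    match extra {
        Some(value) => base + value,
        None => base,
    }
}

fn main() {
    println!("{}", add_optional(5, Some(3)));
    println!("{}", add_optional(5, None));

    let missing: Option<i8> = None;
    println!("{}", missing.unwrap_or(0)); // falls back to a default
    // missing.unwrap(); // would panic, because there is no value inside
}
```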

Enums are very handy in defining match conditions:

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter => 25,
    }
}

Match is also very useful with the Option type:

fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}

let five = Some(5);
let six = plus_one(five);
let none = plus_one(None);

Matches are exhaustive and thus must cover all possibilities. We can use _ as a match pattern to catch everything else we haven’t specified.
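
A sketch of a catch-all arm on a type with too many values to list; describe_roll is a made-up function and the wording of its results is arbitrary.

```rust
fn describe_roll(roll: u8) -> &'static str {
    match roll {
        1 => "lowest roll",
        6 => "highest roll",
        // catch-all arm: without it the compiler rejects the match,
        // because the other u8 values would not be covered
        _ => "something in between",
    }
}

fn main() {
    for roll in [1, 4, 6] {
        println!("{roll}: {}", describe_roll(roll));
    }
}
```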

Often if we only want to match one pattern and ignore the rest we would do this:

    let config_max = Some(3u8);
    match config_max {
        Some(max) => println!("The maximum is configured to be {}", max),
        _ => (),
    }

but we can make this cleaner using an if let binding:

    let config_max = Some(3u8);
    if let Some(max) = config_max {
        println!("The maximum is configured to be {}", max);
    }

if let takes a pattern and an expression separated by an equal sign.

Crates and Modules

Packages: A Cargo feature that lets you build, test, and share crates
Crates: A tree of modules that produces a library or executable
Modules and use: Let you control the organization, scope, and privacy of paths
Paths: A way of naming an item, such as a struct, function, or module

A crate is the smallest amount of code that the Rust compiler considers at a time; it can be as small as a single Rust file. Crates contain modules. A crate can either be a library or a binary crate. A binary crate is one you compile to an executable; each one must have a main function. Library crates don’t have a main function and don’t compile to an executable. They define functionality which will be shared between projects. When people say crate they usually mean a library crate.

The compiler starts at the crate root: src/lib.rs for a library crate or src/main.rs for a binary crate.
Declare a module using the mod keyword. Let’s say you make a file called garden.rs; in the crate root file you can declare it as a module with mod garden;
A module declared in a file that isn’t the crate root is a submodule. If we declare mod vegetables; in the garden.rs file then vegetables will be a submodule of garden.
Refer to items in a module by path, like this: crate::garden::vegetables::Asparagus
Use pub mod to declare a public module, because by default a module and the items inside it are private to their parent module.
The use keyword creates a shortcut to long paths like crate::garden::vegetables::Asparagus: we can instead put use crate::garden::vegetables::Asparagus; at the top of the file and then just write Asparagus. https://doc.rust-lang.org/stable/book/ch07-02-defining-modules-to-control-scope-and-privacy.html
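
The same layout can be tried in a single file by writing the modules inline instead of in garden.rs. This is a sketch under that assumption; the harvest function is made up, and the use line is written relative to the current file so the snippet works on its own.

```rust
mod garden {
    // pub makes the submodule reachable from outside garden
    pub mod vegetables {
        #[derive(Debug)]
        pub struct Asparagus;

        // Hypothetical function, only here to have something to call.
        pub fn harvest() -> Asparagus {
            Asparagus
        }
    }
}

// In a multi-file project this would usually be spelled
// use crate::garden::vegetables::Asparagus;
use garden::vegetables::Asparagus;

fn main() {
    let plant: Asparagus = garden::vegetables::harvest();
    println!("{:?}", plant);
}
```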

Common Collections

Vectors

The Vec<T> type allows you to store more than one value in a single data structure that puts all the values next to each other in memory. A vector only stores values of one type. We can make an empty vector:

let v: Vec<i32> = Vec::new();

However we can also use a macro for making a vector of values:

// rust infers the type here
let mut v = vec![1,2,3];

// adding values:
v.push(4);
v.push(5);

Accessing Data

let v = vec![1,2,3,4,5];

let third: &i32 = &v[2];

// or using get
let third: Option<&i32> = v.get(2);
match third {
	Some(third) => println!("Third value is: {third}."),
	None => println!("there is no 3rd element."),
}

Here we are using & to get a reference to the element. Using [] for access will cause the program to panic when the index is out of range. If you use get it will return None if it is out of range and will not panic. If index is uncertain use get.

let mut v = vec![1, 2, 3, 4, 5];

let first = &v[0];

v.push(6);

println!("The first element is: {first}");

The borrow checker enforces ownership and reference rules. The above is invalid code because first holds an immutable reference into v, push needs a mutable reference to v, and we aren’t allowed to hold both at the same time.

This rule exists because adding a new element might require allocating new memory and copying the old elements over, since the elements must sit next to each other in memory. If that happened, first would be pointing to deallocated memory.
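
One way to make the example compile, sketched below, is to copy the element out instead of keeping a reference to it; another is simply to take the reference after the push. The function name is made up.

```rust
// Returns (the first element, the length after the push).
fn first_after_push() -> (i32, usize) {
    let mut v = vec![1, 2, 3, 4, 5];

    let first = v[0]; // i32 is Copy: first is a value, not a reference into v

    v.push(6); // no outstanding borrow, so the vector may reallocate freely

    println!("The first element is: {first}");
    (first, v.len())
}

fn main() {
    first_after_push();
}
```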

Iteration:

let v = vec![100, 32, 57];
for i in &v {
	println!("{i}");
}

If you want to make changes, loop over a mutable reference:

    let mut v = vec![100, 32, 57];
    for i in &mut v {
        *i += 50;
    }

To update the value that the reference refers to we have to dereference it first. We can’t add elements to the vector within the loop because the reference that the for loop holds prevents simultaneous modification. Instead we would have to loop over the indices.
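
Looping over indices might look like the sketch below, where the length is read once up front so that newly pushed elements are not visited. duplicate_evens is a made-up example function.

```rust
// Appends a copy of every even element to the end of the vector.
fn duplicate_evens(v: &mut Vec<i32>) {
    let original_len = v.len(); // fixed before the loop starts
    for i in 0..original_len {
        let value = v[i]; // copy the element out, ending the borrow
        if value % 2 == 0 {
            v.push(value); // allowed: no reference into v is held here
        }
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    duplicate_evens(&mut v);
    println!("{:?}", v);
}
```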

If we really wanted to store different types in the vector then we could use an enum:

    enum SpreadsheetCell {
        Int(i32),
        Float(f64),
        Text(String),
    }

    let row = vec![
        SpreadsheetCell::Int(3),
        SpreadsheetCell::Text(String::from("blue")),
        SpreadsheetCell::Float(10.12),
    ];

Vector memory is freed when it goes out of scope.

String

Strings are collections of bytes. The core language has only one string type, the string slice str, usually seen in its borrowed form &str: a reference to UTF-8 encoded string data. String literals are stored in the program binary and are therefore string slices. The String type is implemented in the Rust standard library as a wrapper around a vector of bytes. Many of the same methods are available on a String as on a vector, e.g. ::new().

// make a string 3 ways:
let data = "initial contents";
let s = data.to_string();

let s = "initial contents".to_string();

let s = String::from("initial contents");

Like a vector, a String can grow in size: you can push data into it, or use the + operator or the format! macro for concatenation. We can append a string slice:

let mut s = String::from("foo");
s.push_str("bar"); // foobar
// the push method takes 1 character
s.push('!'); //foobar!

Or using the + operator:

    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");
    let s3 = s1 + &s2;

s1 is moved so cannot be used after that point, but s2 was only borrowed so can still be used. The + operator uses the add method, whose signature looks something like this:

fn add(self, s: &str) -> String {

So the right-hand string has to be passed as a reference. s3 takes ownership of s1’s contents and appends a copy of s2. If we need to join many strings it’s cleaner to use format!:

let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");

let s = format!("{s1}-{s2}-{s3}");

Rust strings do not support indexing with a single integer. A String is stored as a vector of UTF-8 bytes, so an index would have to return a byte, which is often not a whole character. Characters in non-Latin scripts may take more than one byte, so the length (which is in bytes) will probably be larger than you’d expect. You can slice with a range of bytes instead:

let hello = "Здравствуйте";

let s = &hello[0..4]; // `Зд`

Each character in Cyrillic is 2 bytes, so slicing the first 4 bytes gives you the first 2 characters. Slicing through the middle of a character, e.g. &hello[0..1], panics at runtime. If we want characters rather than bytes we have to be explicit and say hello.chars();.
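
A sketch contrasting byte length with character count, and using get, which returns None instead of panicking when a range does not fall on character boundaries. byte_and_char_len is a made-up helper name.

```rust
// Hypothetical helper: (length in bytes, number of chars).
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let hello = "Здравствуйте";

    let (bytes, chars) = byte_and_char_len(hello);
    println!("{bytes} bytes, {chars} chars");

    println!("{:?}", hello.get(0..4)); // a valid range: the first two letters
    println!("{:?}", hello.get(0..1)); // cuts a letter in half, so None

    for c in hello.chars().take(2) {
        println!("{c}");
    }
}
```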

HashMaps

The type of a hash map is HashMap<K, V>, where K is the key type and V the value type. Basic syntax:

use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

Accessing Values:

use std::collections::HashMap;
let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

let team_name = String::from("Blue");
// get returns `Option<&V>`
let score = scores.get(&team_name).copied().unwrap_or(0);

.copied() converts the Option<&i32> to an Option<i32>, then unwrap_or sets score to 0 if there is no entry for that key.

Loops:

for (key, value) in &scores {
	println!("{key}: {value}");
}

For types which implement the Copy trait, like i32, the values are copied into the hash map. For owned values like a String, the values are moved and the hash map becomes the owner. If we insert references instead, the values aren’t moved, but the values the references point to must stay valid for at least as long as the hash map is valid.
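
The three cases can be sketched side by side. The function and variable names below are made up for the example.

```rust
use std::collections::HashMap;

// Returns (score after insert, team after the borrowing map is gone, looked-up value).
fn ownership_demo() -> (i32, String, Option<i32>) {
    let name = String::from("Blue");
    let score = 10;

    let mut owned: HashMap<String, i32> = HashMap::new();
    owned.insert(name, score);
    // println!("{name}"); // would not compile: name was moved into the map
    println!("{score}"); // fine: i32 is Copy, so score was copied

    let team = String::from("Yellow");
    let found = {
        // This map stores references, so team must outlive it.
        let mut borrowed: HashMap<&str, i32> = HashMap::new();
        borrowed.insert(&team, 50);
        borrowed.get("Yellow").copied()
    };

    (score, team, found)
}

fn main() {
    println!("{:?}", ownership_demo());
}
```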

Updating

Inserting with an existing key overwrites the value:

    use std::collections::HashMap;

    let mut scores = HashMap::new();

    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Blue"), 25);

    println!("{:?}", scores);

We can also use entry together with or_insert to only add a value if the key doesn’t already exist:

    use std::collections::HashMap;

    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);

    scores.entry(String::from("Yellow")).or_insert(50);
    scores.entry(String::from("Blue")).or_insert(50);

    println!("{:?}", scores);

Counting Frequency:

    use std::collections::HashMap;

    let text = "hello world wonderful world";
    let mut map = HashMap::new();

    for word in text.split_whitespace() {
        let count = map.entry(word).or_insert(0);
        *count += 1;
    }
    println!("{:?}", map);

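or_insert returns a mutable reference (&mut V) to the value for the key, so we have to dereference count with * in order to update it. By default HashMap uses the SipHash hashing function because it provides resistance to DoS attacks. It is not the fastest hash available, and a different hasher can be plugged in if needed.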

Error Handling

Rust errors are grouped into recoverable (reported to the caller) and unrecoverable (execution stops). Unlike many languages, Rust doesn’t have exceptions. Instead it has the Result<T, E> type for recoverable errors and the panic! macro, which stops execution, for unrecoverable ones.

In C, if you read from an array at a position beyond its end, you might get back whatever happens to be at that memory location, even though it doesn’t belong to the array. This is a buffer overread. Rust instead panics when you index past the end of a vector or array.

If we run our program using RUST_BACKTRACE=1 cargo run we will get a backtrace when the program panics.
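As a rough illustration of the bounds checking described above, here is a sketch that avoids the panic by using get. The helper name element_at is made up for this example.

```rust
// Safe lookup: returns None when `index` is out of bounds.
fn element_at(v: &[i32], index: usize) -> Option<i32> {
    v.get(index).copied()
}

fn main() {
    let v = vec![1, 2, 3];

    println!("{:?}", element_at(&v, 1));
    println!("{:?}", element_at(&v, 99));

    // Indexing with v[99] would panic with an "index out of bounds"
    // message rather than reading memory that is not part of the vector.
}
```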

The Result type is an Enum which looks like this:

enum Result<T, E> {
	Ok(T),
	Err(E),
}

Lots of functions return this in case they fail, for instance opening a file:

use std::fs::File;

fn main() {
	let greeting_file_result = File::open("hello.txt");

	let greeting_file = match greeting_file_result {
		Ok(file) => file,
		Err(error) => panic!("Problem Opening File: {:?}", error),
	};
}

We can also match specific errors:

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let greeting_file_result = File::open("hello.txt");

    let greeting_file = match greeting_file_result {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Problem creating the file: {:?}", e),
            },
            other_error => {
                panic!("Problem opening the file: {:?}", other_error);
            }
        },
    };
}

We can rewrite this using closures and unwrap_or_else to avoid the nested match expressions:

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let greeting_file = File::open("hello.txt").unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            File::create("hello.txt").unwrap_or_else(|error| {
                panic!("Problem creating the file: {:?}", error);
            })
        } else {
            panic!("Problem opening the file: {:?}", error);
        }
    });
}

Shortcuts: unwrap and expect are shortcuts that panic on an error. unwrap behaves just like the match expression above: if the Result is Ok it returns the value inside Ok, and if the Result is Err it calls panic!. expect works like unwrap but takes a message parameter which is passed to panic!.
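A short sketch of both shortcuts on the Result returned by str::parse. The helper parse_port_or_default is a made-up name, included only to show a non-panicking alternative (unwrap_or) next to them.

```rust
// Parses a port number, falling back to `default` when parsing fails.
fn parse_port_or_default(input: &str, default: u16) -> u16 {
    input.trim().parse::<u16>().unwrap_or(default)
}

fn main() {
    // unwrap: returns the value inside Ok, panics on Err.
    let a: u16 = "8080".parse().unwrap();

    // expect: same as unwrap, but with our own panic message.
    let b: u16 = "443".parse().expect("port should be a valid number");

    println!("{a} {b}");

    // For input we do not control, a fallback avoids the panic.
    println!("{}", parse_port_or_default("not a port", 80));
}
```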

Error Propagation

If a function runs code which might fail, instead of handling the error there you can return the error and let the calling function deal with it. This is called propagating the error.

use std::fs::File;
use std::io::{self, Read};

fn read_username_from_file() -> Result<String, io::Error> {
    let username_file_result = File::open("hello.txt");

    let mut username_file = match username_file_result {
        Ok(file) => file,
        Err(e) => return Err(e),
    };

    let mut username = String::new();

    match username_file.read_to_string(&mut username) {
        Ok(_) => Ok(username),
        Err(e) => Err(e),
    }
}

Here we try to read a file and if there is an error we return the error to whatever function called the read_username_from_file function.

The return type of the function is Result<String, io::Error>. These are concrete types which fill in the T and E type parameters.
- With no error, the caller receives Ok holding a String (the username).
- On an error, the caller gets an Err holding an io::Error (we use io::Error because that is the error type returned by both File::open and read_to_string).

The function works like this:
- If File::open succeeds, the file handle becomes the value of username_file and the function keeps going.
- If it fails, instead of panicking we use the return keyword to return early with the Err. Whatever calls read_username_from_file then has to deal with the error.
- A new String is created and read_to_string puts the file contents into username.
  - This also returns a Result, so we match on it.
  - Here we don’t need the return keyword since it’s the last expression of the function.

This pattern is so common that Rust has a shortcut for it:

use std::fs::File;
use std::io::{self, Read};

fn read_username_from_file() -> Result<String, io::Error> {
    let mut username_file = File::open("hello.txt")?;
    let mut username = String::new();
    username_file.read_to_string(&mut username)?;
    Ok(username)
}

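This does almost the same as the code before: on Ok the ? yields the inner value and the function continues, on Err the function returns early with that error.

The difference is that error values that have the ? operator called on them go through the from function, defined in the From trait in the standard library, which is used to convert values from one type into another.

So ? will also convert any error received into the error type specified in the return type, as long as a From implementation for that conversion exists.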
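To make the From conversion concrete, here is a sketch with a custom error type. ConfigError, parse_number, read_number and the file name number.txt are all made up for this example; only the From trait, fs::read_to_string and str::parse come from the standard library.

```rust
use std::fs;
use std::io;
use std::num::ParseIntError;

// A hypothetical error type that wraps the two errors we can hit.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self {
        ConfigError::Io(e)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Parse(e)
    }
}

fn parse_number(text: &str) -> Result<i32, ConfigError> {
    // On Err, ? calls ConfigError::from on the ParseIntError.
    let n = text.trim().parse::<i32>()?;
    Ok(n)
}

fn read_number(path: &str) -> Result<i32, ConfigError> {
    // On Err, ? calls ConfigError::from on the io::Error.
    let text = fs::read_to_string(path)?;
    parse_number(&text)
}

fn main() {
    match read_number("number.txt") {
        Ok(n) => println!("read {n}"),
        Err(e) => println!("failed: {e:?}"),
    }
}
```

Both functions return the same error type even though the underlying failures are different, which is what lets the caller handle them in one match.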

We could also chain things to make that even shorter:

use std::fs::File;
use std::io::{self, Read};

fn read_username_from_file() -> Result<String, io::Error> {
    let mut username = String::new();

    File::open("hello.txt")?.read_to_string(&mut username)?;

    Ok(username)
}

This is also a very common operation so we have a function for it in the fs module of the standard library:

use std::fs;
use std::io;

fn read_username_from_file() -> Result<String, io::Error> {
    fs::read_to_string("hello.txt")
}

The ? operator can only be used in functions whose return type is compatible with the value the ? is used on:

- ? on a Result needs a function that returns a Result
- ? on an Option needs a function that returns an Option
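A small sketch of ? on an Option. The function name last_char_of_first_line is chosen for this example.

```rust
// Returns the last character of the first line, if there is one.
fn last_char_of_first_line(text: &str) -> Option<char> {
    // If there is no first line, ? returns None from the function early.
    text.lines().next()?.chars().last()
}

fn main() {
    println!("{:?}", last_char_of_first_line("Hello, world\nHow are you?"));
    println!("{:?}", last_char_of_first_line(""));
}
```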

Use panic! when you are writing examples to demonstrate a concept, prototyping, or writing tests. It is also appropriate when external code that is out of your control returns an invalid state you have no way of fixing. When failure is expected it is more appropriate to return a Result.

Generic Types, Traits and Lifetimes.

We use generics to create function signatures or structs which can be used on many different concrete types.

fn largest<T>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

Instead of defining a separate largest function for each type, we can use a generic type parameter so one function works for many types. The function above won’t compile as written, since not all types can be compared with >. We have to restrict T to the types that can:

fn largest<T: std::cmp::PartialOrd>(list: &[T]) -> &T {}
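Putting the bound and the body together gives a version that compiles. This is a sketch that keeps the original behaviour, including the panic on an empty slice caused by list[0].

```rust
// Works for any element type that supports comparison with `>`.
// Panics if `list` is empty, because of the `list[0]` access.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let numbers = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&numbers));

    let chars = vec!['y', 'm', 'a', 'q'];
    println!("The largest char is {}", largest(&chars));
}
```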

A generic struct could look like this:

struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}

Note: Since the type is T both x and y have to be the same type. If you want them to be different you have to define another type parameter:

struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    let both_integer = Point { x: 5, y: 10 };
    let both_float = Point { x: 1.0, y: 4.0 };
    let integer_and_float = Point { x: 5, y: 4.0 };
}

Enums can also be generic, like Result<T, E> and Option<T>.

Similarly in method definitions:

struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };
    println!("p.x = {}", p.x());
}

Generics have no runtime cost: Rust uses monomorphization, which at compile time turns generic code into concrete versions for each type it is actually used with.

Traits

A trait defines behaviour that different types can share. A type’s behaviour consists of the methods we can call on it, and different types often share the same behaviour. Traits let us define that shared behaviour in an abstract way, and trait bounds let generic code accept any type that has it.

pub trait Summary {
	fn summarize(&self) -> String;
}

In this case we end the method signature with a semicolon instead of a body, so each type which implements the trait has to provide its own body for summarize.

Implementing a Trait on a type:

pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}

impl Summary for NewsArticle {
    fn summarize(&self) -> String {
        format!("{}, by {} ({})", self.headline, self.author, self.location)
    }
}

To implement a trait on a type we write impl, the trait name, the for keyword, and then the type. Now a NewsArticle, or any other type that implements Summary, can call it like a regular method: instance.summarize().

Traits can also have default implementations:

pub trait Summary {
    fn summarize(&self) -> String {
        String::from("(Read more...)")
    }
}

Then to use the default implementation to summarize instances of NewsArticle, we specify an empty impl block with impl Summary for NewsArticle {}.

Calls to summarize with no implementation will call the default implementation.

Default implementations can call methods which don’t have default implementations.

pub trait Summary {
    fn summarize_author(&self) -> String;

    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

To use this we only need to define summarize_author:

impl Summary for Tweet {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}
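For reference, a complete sketch of the pieces above in one file. The Tweet struct is not defined in these notes, so its definition here is an assumption based on the fields used later in returns_summarizable.

```rust
pub trait Summary {
    // Required: every implementor must provide this.
    fn summarize_author(&self) -> String;

    // Provided: a default that builds on the required method.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

// Assumed definition of Tweet, for illustration only.
pub struct Tweet {
    pub username: String,
    pub content: String,
    pub reply: bool,
    pub retweet: bool,
}

impl Summary for Tweet {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}

fn main() {
    let tweet = Tweet {
        username: String::from("horse_ebooks"),
        content: String::from("of course, as you probably already know, people"),
        reply: false,
        retweet: false,
    };

    // Calls the default summarize, which calls our summarize_author.
    println!("1 new tweet: {}", tweet.summarize());
}
```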

Traits as parameters:

Traits can be used to define functions which accept many different types.

pub fn notify(item: &impl Summary) {
    println!("Breaking news! {}", item.summarize());
}

Here the function takes an item, but the only valid arguments are types which implement Summary. The above code is syntax sugar for a trait bound:

pub fn notify<T: Summary>(item: &T) {
	println!("Breaking news! {}", item.summarize());
}

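This is more verbose: we declare a generic type parameter and place the trait bound on it. The two forms differ when there are several parameters. With impl Summary each parameter can be a different type, while with a single generic T both parameters must be the same type: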

pub fn notify(item1: &impl Summary, item2: &impl Summary) {

pub fn notify<T: Summary>(item1: &T, item2: &T) {

We can also specify multiple trait bounds with +:

pub fn notify(item: &(impl Summary + Display)) {

pub fn notify<T: Summary + Display>(item: &T) {

Using where to clean up trait bounds:

Many trait bounds make a signature hard to read. Instead of writing this:

fn some_function<T: Display + Clone, U: Clone + Debug>(t: &T, u: &U) -> i32 {

we can use a where clause, like this:

fn some_function<T, U>(t: &T, u: &U) -> i32
where
    T: Display + Clone,
    U: Clone + Debug,
{
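A complete function using a where clause might look like the sketch below. The name describe and its body are made up for this example; the bounds are the same as in the signature above.

```rust
use std::fmt::{Debug, Display};

// The bounds live in the where clause, keeping the parameter list short.
fn describe<T, U>(t: &T, u: &U) -> String
where
    T: Display + Clone,
    U: Clone + Debug,
{
    // Clone is available because of the bounds on T and U.
    let t_copy = t.clone();
    let u_copy = u.clone();

    // Display lets us use {} on T, Debug lets us use {:?} on U.
    format!("{t_copy} / {u_copy:?}")
}

fn main() {
    println!("{}", describe(&5, &vec![1, 2]));
}
```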

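Return types can also use impl Trait, to return some type that implements a trait without naming the concrete type (the function must still return a single concrete type):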

fn returns_summarizable() -> impl Summary {
    Tweet {
        username: String::from("horse_ebooks"),
        content: String::from(
            "of course, as you probably already know, people",
        ),
        reply: false,
        retweet: false,
    }
}

Lifetimes

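Lifetimes are a kind of generic which ensure that a reference is valid for as long as we need it to be. Most of the time lifetimes are inferred (a value is valid while its variable is in scope). We only annotate them when the compiler can’t work out how the lifetimes of different references relate.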

A Dangling Reference

fn main() {
    let r;

    {
        let x = 5;
        r = &x;
    }

    println!("r: {}", r);
}

In this code snippet we are attempting to print r, which was set to a reference to x in the inner scope. x was dropped when the inner scope ended, so Rust errors, saying that the borrowed value does not live long enough. Otherwise r would be referencing deallocated memory.

The Rust borrow checker compares scopes to determine whether borrows are valid. Call the lifetime of r 'a and the lifetime of x 'b. At compile time Rust compares the two and errors if 'b is shorter than 'a, because the subject of the reference doesn’t live as long as the reference.

Data has to live at least as long as any reference to it, otherwise the reference would dangle.

When we write a function like this:

fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Here we want to return the longer of two string slices. Rust will error because it can’t tell whether the returned reference refers to x or y, so it doesn’t know how long the result is valid.

Lifetime annotations are written as an apostrophe followed by a short name, normally 'a for the first lifetime. On their own they have no meaning. They are only there to tell Rust how the lifetimes of different references relate to each other.

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

This tells Rust that x, y and the return value all share some lifetime 'a. In practice 'a becomes the shorter of the lifetimes of x and y, so the returned slice is valid only as long as both arguments are.
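A sketch of what that means for callers. The variable names are made up. The result of longest cannot be kept as a reference past the inner scope, because string2 is dropped there, so we copy it into an owned String instead.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");
    let result;

    {
        let string2 = String::from("xyz");
        // The returned reference is only valid while both string1 and
        // string2 are alive, so we turn it into an owned String here.
        // Storing the bare reference in `result` would not compile.
        result = longest(string1.as_str(), string2.as_str()).to_string();
    }

    println!("The longest string is {result}");
}
```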

We can use references in a struct but we have to specify a lifetime:

struct ImportantExcerpt<'a> {
    part: &'a str,
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.').next().expect("Could not find a '.'");
    let i = ImportantExcerpt {
        part: first_sentence,
    };
}

This makes sure an instance of the struct can’t outlive the data its reference points to.

Static lifetimes are ones which can live for the entire duration of the program:

let s: &'static str = "I have a static lifetime.";

All string literals have this by default.

Testing

We define tests like this:

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        let result = 2 + 2;
        assert_eq!(result, 4);
    }
}

We can run cargo test to run our tests. The main assertion macros are:

- assert_eq!
- assert_ne!
- assert!

Tests can have custom failure messages:

    #[test]
    fn greeting_contains_name() {
        let result = greeting("Carol");
        assert!(
            result.contains("Carol"),
            "Greeting did not contain name, value was `{}`",
            result
        );
    }

Or we can check that code panics:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[should_panic]
    fn greater_than_100() {
        Guess::new(200);
    }
}
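The test snippets above rely on greeting and Guess, which are not defined in these notes. The sketch below fills them in with assumed definitions so the tests can actually be run with cargo test. The module name guess_tests is arbitrary.

```rust
// Assumed implementation, for illustration only.
pub fn greeting(name: &str) -> String {
    format!("Hello {name}!")
}

// Assumed type: a guess that must be between 1 and 100.
pub struct Guess {
    value: i32,
}

impl Guess {
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {value}.");
        }
        Guess { value }
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}

#[cfg(test)]
mod guess_tests {
    use super::*;

    #[test]
    fn greeting_contains_name() {
        let result = greeting("Carol");
        assert!(
            result.contains("Carol"),
            "Greeting did not contain name, value was `{}`",
            result
        );
    }

    // `expected` makes the test pass only if the panic message
    // contains this text.
    #[test]
    #[should_panic(expected = "between 1 and 100")]
    fn greater_than_100() {
        Guess::new(200);
    }
}

fn main() {
    println!("{}", greeting("Carol"));
    println!("{}", Guess::new(42).value());
}
```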

Functional Programming in Rust

A closure is an anonymous function which you can save in a variable or pass as an argument. Unlike functions, closures can capture values from the scope in which they are defined.

#[derive(Debug, PartialEq, Copy, Clone)]
enum ShirtColor {
    Red,
    Blue,
}

struct Inventory {
    shirts: Vec<ShirtColor>,
}

impl Inventory {
    fn giveaway(&self, user_preference: Option<ShirtColor>) -> ShirtColor {
        user_preference.unwrap_or_else(|| self.most_stocked())
    }
}

Assume that the method most_stocked is also defined in the impl block. The unwrap_or_else method takes a closure as a parameter. The closure in this case takes no parameters and just calls most_stocked, and it only runs when user_preference is None.
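A runnable sketch of the whole example. The body of most_stocked is an assumption (count each colour and return the more common one, with Blue winning ties), since the notes only say that it exists.

```rust
#[derive(Debug, PartialEq, Copy, Clone)]
enum ShirtColor {
    Red,
    Blue,
}

struct Inventory {
    shirts: Vec<ShirtColor>,
}

impl Inventory {
    fn giveaway(&self, user_preference: Option<ShirtColor>) -> ShirtColor {
        // The closure captures `self` and is only called on None.
        user_preference.unwrap_or_else(|| self.most_stocked())
    }

    // Assumed implementation: count each colour, ties go to Blue.
    fn most_stocked(&self) -> ShirtColor {
        let mut num_red = 0;
        let mut num_blue = 0;

        for color in &self.shirts {
            match color {
                ShirtColor::Red => num_red += 1,
                ShirtColor::Blue => num_blue += 1,
            }
        }

        if num_red > num_blue {
            ShirtColor::Red
        } else {
            ShirtColor::Blue
        }
    }
}

fn main() {
    let store = Inventory {
        shirts: vec![ShirtColor::Blue, ShirtColor::Red, ShirtColor::Blue],
    };

    println!("{:?}", store.giveaway(Some(ShirtColor::Red)));
    println!("{:?}", store.giveaway(None));
}
```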

Closures don’t require you to annotate types. This is because closures are normally used in a narrow context, so the compiler can infer the types. We can, however, annotate the types explicitly if we want to:

    let expensive_closure = |num: u32| -> u32 {
        println!("calculating slowly...");
        thread::sleep(Duration::from_secs(2));
        num
    };

Without annotations, the compiler infers one concrete type for each parameter from the first use:

    let example_closure = |x| x;

    let s = example_closure(String::from("hello"));
    let n = example_closure(5);

After the first call the compiler infers that x is a String, so the second call with 5 fails to compile with a mismatched types error.

fn main() {
    let list = vec![1, 2, 3];
    println!("Before defining closure: {:?}", list);

    let only_borrows = || println!("From closure: {:?}", list);

    println!("Before calling closure: {:?}", list);
    only_borrows();
    println!("After calling closure: {:?}", list);
}

Closures can automatically borrow data from the current scope. This closure only needs an immutable reference to list, and because we can have multiple immutable references at the same time, list is still accessible before and after the closure is called.

Iterators

Iterators allow us to perform a task on a sequence of items in turn. Rust iterators are lazy: they have no effect until you call a method that consumes the iterator.

This doesn’t do anything useful:

let v1 = vec![1, 2, 3];
let v1_iter = v1.iter();

We have to use it, for example in a for loop:

let v1 = vec![1, 2, 3]; 
let v1_iter = v1.iter();

for val in v1_iter { 
	println!("Got: {}", val);
}

The for loop calls next on the iterator for us to get each value. All iterators implement the Iterator trait, which looks like this:

pub trait Iterator { 
	// associated type
	type Item;
	
	fn next(&mut self) -> Option<Self::Item>; 
	
	// methods with default implementations elided
}
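To see the trait in action, here is a sketch of a custom iterator. The Counter type is made up for this example: it yields 1 to 5 and then None. Only next has to be written, and methods such as map, sum and collect come from the default implementations.

```rust
// A hypothetical iterator that counts from 1 to 5.
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    // The associated type: what each call to next yields.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            // None signals that the iterator is finished.
            None
        }
    }
}

fn main() {
    let mut counter = Counter::new();
    println!("{:?}", counter.next());

    // Default methods work on top of our next implementation.
    let total: u32 = Counter::new().map(|x| x * 2).sum();
    println!("{total}");
}
```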

Some methods, called iterator adaptors, produce new iterators:

let v1: Vec<i32> = vec![1, 2, 3]; 
v1.iter().map(|x| x + 1);

However this does nothing on its own, and the compiler warns that the iterator is unused, because the iterator is never consumed. To fix this we use collect to consume the iterator and gather the results:

let v1: Vec<i32> = vec![1, 2, 3];
let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();

Extra Patterns

let x = 1; 
// OR 
match x {
	1 | 2 => println!("one or two"),
	3 => println!("three"),
	_ => println!("anything"),
}

Inclusive range:

let x = 5; 

match x { 
	1..=5 => println!("one through five"),
	_ => println!("something else"),
}

Ranges also work with char values:

let x = 'c';
match x { 
	'a'..='j' => println!("early ASCII letter"),
	'k'..='z' => println!("late ASCII letter"),
	_ => println!("something else"),
}

Destructuring

struct Point { 
	x: i32, 
	y: i32,
}

fn main() {
	let p = Point { x: 0, y: 7 };
	let Point { x: a, y: b } = p; 
	assert_eq!(0, a);
	assert_eq!(7, b);
}

If the variable names match the field names we can use the shorthand, and patterns can also match on literal field values:

fn main() {
	let p = Point { x: 0, y: 7 };
	let Point { x, y } = p;
	assert_eq!(0, x);
	assert_eq!(7, y);

	match p {
		Point { x, y: 0 } => println!("On the x axis at {x}"),
		Point { x: 0, y } => println!("On the y axis at {y}"),
		Point { x, y } => {
			println!("On neither axis: ({x}, {y})");
		}
	}
}