Ikeh Akinyemi Ikeh Akinyemi is a software engineer based in Rivers State, Nigeria. He’s passionate about learning pure and applied mathematics concepts, open source, and software engineering.

Learn how to read a file in Rust

Working with files can be a finicky but inevitable part of software engineering, and as a developer, you will often need to load information from external sources to use in your projects.

In this blog post, you’ll learn how to read files in Rust. Specifically, you’ll learn how to read a JSON file, a YAML file, and a TOML file in the Rust programming language.

Accessing a file

To start, we’ll first need to create a sample file that we’ll access through our project. You can either manually create the file or you can use the write() function provided by the Rust standard library.

Let’s bootstrap a Rust starter project with the following command on the terminal:

cargo new sample_project

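Next, create a new file inside the root directory of the project, next to Cargo.toml. The examples below use relative paths, which are resolved against the directory you run cargo run from, typically the project root.

This file will be called info.txt and will contain a short line of text, like so: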

// info.txt
Check out more Rust articles on LogRocket Blog
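If you would rather create the sample file from code, as mentioned at the start of this section, here is a minimal sketch using fs::write from the standard library. The helper name create_sample_file is only an illustration and not part of the project; fs::write creates the file if it is missing and replaces its contents if it already exists.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: writes the sample text to `path`,
/// creating the file or overwriting it if it already exists.
fn create_sample_file(path: &Path) -> io::Result<()> {
    fs::write(path, "Check out more Rust articles on LogRocket Blog")
}

fn main() -> io::Result<()> {
    create_sample_file(Path::new("info.txt"))?;
    println!("created info.txt");
    Ok(())
}
```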

Reading the file as a string

First, we need to bring the fs module into scope with a use statement. Rust's standard library, the std crate, provides the fs module for file read and write operations.

use std::fs;
fn main() {
    let file_contents = fs::read_to_string("info.txt")
        .expect("LogRocket: Should have been able to read the file");
    println!("info.txt contents =\n{file_contents}");
}

The snippet above opens and reads the file at the path passed to the read_to_string function in the fs module, returning its contents as a String. Because reading can fail (the file may be missing, there may be a permission error, or the contents may not be valid UTF-8), read_to_string returns a Result. Calling expect on that Result either yields the contents or stops the program with a panic that displays the message we passed in.

Using the cargo run command, the above program will be compiled and run, and then output the content of the file we created previously. For this example, it will have the same value as the content of info.txt.

Reading a file as a vector

Reading a file as a vector can be useful if you want to store the contents of the file in memory for easy access or modification. It can also be useful for reading binary files, as a vector of bytes can represent the data more accurately than a string. Reading a file as a vector allows you to read the entire file into memory at once, rather than reading it piece by piece. This can be more convenient if you need to access the file multiple times or if you want to perform operations on the entire file.

To read a file as a vector, you can use the read_to_end method of the Read trait:

// Rust
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    let mut file = File::open("info.txt")?;
    let mut contents = Vec::new();
    file.read_to_end(&mut contents)?;

    println!("File contents: {:?}", contents);

    Ok(())
}

This code opens a file called "info.txt" and reads it into a vector of bytes called contents. The read_to_end method reads the file from the current position to the end of the file and appends the data to the end of the contents vector.
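For the common case where you just want every byte of a file, the standard library also offers fs::read, which opens the file and fills a Vec<u8> in one call. The sketch below assumes the same info.txt file; the read_bytes wrapper is a hypothetical name used only to make the example easy to reuse.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical wrapper: reads the whole file into a byte vector in one call.
fn read_bytes(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

fn main() -> io::Result<()> {
    let bytes = read_bytes(Path::new("info.txt"))?;
    println!("read {} bytes", bytes.len());
    // Lossy conversion replaces invalid UTF-8 sequences with U+FFFD
    // instead of failing, which is handy for a quick look at the data.
    println!("{}", String::from_utf8_lossy(&bytes));
    Ok(())
}
```

Unlike read_to_string, this works for binary files because no UTF-8 validation is performed on the bytes.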

Reading a file with a buffer

Reading a file through a buffer can be more efficient than reading the entire file at once because it allows the program to process the data in chunks. This is particularly useful for large files that may not fit in memory in their entirety.

To read a file using a buffer, you can use the BufReader struct and the BufRead trait:

// Rust
use std::fs::File;
use std::io::{BufReader, BufRead};

fn main() -> std::io::Result<()> {
    let file = File::open("info.txt")?;
    let reader = BufReader::new(file);

    for line in reader.lines() {
        let line = line?;
        println!("{}", line);
    }

    Ok(())
}

This code opens a file called "info.txt" and creates a BufReader to read it line by line. The BufReader reads the file in chunks (or “buffers”) rather than reading it all at once, which can be more efficient for large files.
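Because lines() yields one line at a time, you can compute something over a file without ever holding all of it in memory. Here is a small sketch that counts non-empty lines; the function name count_non_empty_lines is an assumption for illustration, and it is generic over BufRead so it works with a file or with in-memory data.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Hypothetical helper: counts lines that contain something other than
/// whitespace, reading one line at a time from any buffered reader.
fn count_non_empty_lines<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        // `line` is an io::Result<String>; `?` propagates read errors.
        if !line?.trim().is_empty() {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    let reader = BufReader::new(File::open("info.txt")?);
    println!("non-empty lines: {}", count_non_empty_lines(reader)?);
    Ok(())
}
```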



Handling a file I/O error in Rust

To handle the case where a file cannot be opened or read, you can use the std::io::Result type and the ? operator. In the examples above, the ? operator propagates any error that occurs when opening or reading the file: if an error occurs, the function stops executing at that point and returns the error to the caller.
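To make the propagation concrete, the following sketch puts the ? operator inside a helper and lets the caller decide what to do with the error. The helper name first_line is hypothetical; the file name is the same info.txt used earlier.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Hypothetical helper: returns the first line of the file,
/// or `None` if the file is empty.
fn first_line(path: &Path) -> io::Result<Option<String>> {
    // If opening fails, `?` returns the error to the caller right here.
    let file = File::open(path)?;
    let mut line = String::new();
    // read_line returns the number of bytes read; 0 means end of file.
    let bytes_read = BufReader::new(file).read_line(&mut line)?;
    if bytes_read == 0 {
        return Ok(None);
    }
    Ok(Some(line.trim_end().to_string()))
}

fn main() {
    // The caller sees either the value or the propagated error.
    match first_line(Path::new("info.txt")) {
        Ok(Some(line)) => println!("first line: {line}"),
        Ok(None) => println!("info.txt is empty"),
        Err(error) => eprintln!("could not read info.txt: {error}"),
    }
}
```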

Here’s an example of handling a specific I/O error:

// Rust
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    let mut file = match File::open("info.txt") {
        Ok(file) => file,
        Err(error) => {
            match error.kind() {
                std::io::ErrorKind::NotFound => {
                    println!("File not found");
                    return Ok(());
                }
                _ => return Err(error),
            }
        }
    };
    let mut contents = Vec::new();
    file.read_to_end(&mut contents)?;

    println!("File contents: {:?}", contents);

    Ok(())
}

In this example, the code tries to open the file "info.txt". If the file is not found, it prints a message and returns Ok(()). If any other error occurs, it returns the error to the caller. If the file is successfully opened, it reads the contents of the file into a vector as before.
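The same idea can be written more compactly with a match guard. The sketch below is one possible variation rather than the only way to do it: it falls back to a default string when the file is missing and still propagates every other error. The function name read_or_default is an assumption for illustration.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Hypothetical helper: returns the file contents, or `default` when the
/// file does not exist. Any other I/O error is passed on to the caller.
fn read_or_default(path: &Path, default: &str) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents),
        Err(error) if error.kind() == ErrorKind::NotFound => Ok(default.to_string()),
        Err(error) => Err(error),
    }
}

fn main() -> io::Result<()> {
    let contents = read_or_default(Path::new("info.txt"), "(no info.txt found)")?;
    println!("{contents}");
    Ok(())
}
```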

The Serde framework

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically. For this section of the article, we will use the serde crate to parse a JSON file and a TOML file; the YAML example later on relies on it as well, through the config crate.

The main advantage of the serde library is that it lets you parse wire data directly into Rust structs defined in your source code. This way, the expected type of every piece of incoming data is known at compile time, and data that does not match is rejected when it is parsed.

Reading a JSON file

JSON is a popular format for storing structured data. It is the predominant format for exchanging data on the web and is widely used across JavaScript projects.

We can parse JSON data in Rust in two ways: a statically-typed method and a dynamically-typed method.

The dynamically-typed method is best suited for cases where you are not certain that the JSON data matches a data structure predefined in your source code, while the statically-typed method is used when you are certain of the format of the JSON data.

To get started, you must install all the required dependencies.

In the Cargo.toml file, we’ll first add the serde and the serde_json crates as the dependencies. In addition to this, make sure the optional derive feature is enabled, which will help us generate the code for the (de)serialization.

# Cargo.toml
[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

Parsing JSON dynamically

First, we write a use declaration to import the Value enum from the serde_json crate. Value represents any valid JSON value: it could be a string, null, boolean, array, and so on.

Inside the project root, create a data folder, then create a sales.json file in it and fill it with the sample JSON data. The program output further down shows the structure of this data: a products array and a sales array.

Now that we have the wire data, we can update our main.rs file to parse it using the serde_json crate:

use serde_json::Value;
use std::fs;
fn main() {
    let sales_and_products = {
        let file_content = fs::read_to_string("./data/sales.json").expect("LogRocket: error reading file");
        serde_json::from_str::<Value>(&file_content).expect("LogRocket: error deserializing JSON")
    };
    println!("{:?}", serde_json::to_string_pretty(&sales_and_products).expect("LogRocket: error serializing to JSON"));
}

In the above code snippet, we hardcoded the path to the sales.json file and read its contents into a String. The serde_json crate then handles (de)serialization for the JSON format.

The from_str function takes a string slice (&str) and deserializes an instance of the type Value from it, according to the rules of the JSON format. You can inspect the Value type to learn more about its (de)serialization.

This is the output from running the code snippet:

"{\n  \"products\": [\n    {\n      \"category\": \"fruit\",\n      \"id\": 591,\n      \"name\": \"orange\"\n    },\n    {\n      \"category\": \"furniture\",\n      \"id\": 190,\n      \"name\": \"chair\"\n    }\n  ],\n  \"sales\": [\n    {\n      \"date\": 1234527890,\n      \"id\": \"2020-7110\",\n      \"product_id\": 190,\n      \"quantity\": 2.0,\n      \"unit\": \"u.\"\n    },\n    {\n      \"date\": 1234567590,\n      \"id\": \"2020-2871\",\n      \"product_id\": 591,\n      \"quantity\": 2.14,\n      \"unit\": \"Kg\"\n    },\n    {\n      \"date\": 1234563890,\n      \"id\": \"2020-2583\",\n      \"product_id\": 190,\n      \"quantity\": 4.0,\n      \"unit\": \"u.\"\n    }\n  ]\n}"
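The escaped \n and \" sequences in this output come from the {:?} (Debug) formatter being applied to the String that to_string_pretty returns; the {} (Display) formatter would print the line breaks and quotes as they are. Here is a std-only sketch of the difference, using a made-up JSON fragment and two hypothetical helpers (debug_form and display_form) that are not part of the project:

```rust
/// Hypothetical helper: formats the text the way `{:?}` does,
/// wrapped in quotes with special characters escaped.
fn debug_form(text: &str) -> String {
    format!("{:?}", text)
}

/// Hypothetical helper: formats the text the way `{}` does, unchanged.
fn display_form(text: &str) -> String {
    format!("{}", text)
}

fn main() {
    // A made-up fragment standing in for pretty-printed JSON.
    let pretty = "{\n  \"id\": 591\n}";
    println!("{}", debug_form(pretty)); // one line, with visible \n and \" escapes
    println!("{}", display_form(pretty)); // three lines, as actual JSON
}
```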

In a real project, apart from displaying the output, we’ll want to access the different fields in the JSON data, manipulate the data, or even try to store the updated data in another file or the same file.

With this in mind, let’s try to access a field on the sales_and_products variable and update its data and possibly store it in another file:

use serde_json::{Number, Value};
// --snip--

fn main() {
    // --snip--
    if let Value::Number(quantity) = &sales_and_products["sales"][1]["quantity"] {
        sales_and_products["sales"][1]["quantity"] =
            Value::Number(Number::from_f64(quantity.as_f64().unwrap() + 3.5).unwrap());
    }
    fs::write(
        "./data/sales.json",
        serde_json::to_string_pretty(&sales_and_products).expect("LogRocket: error parsing to JSON"),
    )
    .expect("LogRocket: error writing to file");
}

In the above code snippet, we use the Value::Number variant to pattern match against sales_and_products["sales"][1]["quantity"], which we expect to be a number value.

Using the from_f64 function on the Number struct, we converted the finite f64 value returned from the operation, quantity.as_f64().unwrap() + 3.5, back to a Number type, and then we stored it back into sales_and_products["sales"][1]["quantity"], updating its value.

(Note: for this assignment to compile, sales_and_products must be declared as a mutable variable with let mut.)

Then, using the write function with a file path and the new contents as arguments, we overwrite sales.json with the value returned by the serde_json::to_string_pretty function. This value holds the same data as what we previously output on the terminal, with the updated quantity and with real line breaks and indentation.

Parsing JSON statically

On the other hand, if we are certain of the structure of our JSON file, we can use a different method that relies on data structures predefined in our project.

This approach is generally preferred over parsing the data dynamically. The static version's source code declares three structs at the beginning:

use serde::{Deserialize, Serialize};
#[derive(Deserialize, Serialize, Debug)]
struct SalesAndProducts {
    products: Vec<Product>,
    sales: Vec<Sale>
}
#[derive(Deserialize, Serialize, Debug)]
struct Product {
    id: u32,
    category: String,
    name: String
}
#[derive(Deserialize, Serialize, Debug)]
struct Sale {
    id: String,
    product_id: u32,
    date: u64,
    quantity: f32,
    unit: String
}
fn main() {}

The first struct represents the top-level JSON object, with its products and sales fields. The remaining two structs define the expected shape of each element stored inside those two arrays.

To parse (read) JSON strings into the above structs, the Deserialize trait is necessary. To format (that is, write) the above structs as valid JSON, the Serialize trait must be present. The Debug trait comes in handy when we simply want to print a struct on the terminal as a debug trace.

The body of our main function should resemble the below code snippet:

use std::fs;
use std::io;
use serde::{Deserialize, Serialize};

// --snip--

fn main() -> Result<(), io::Error> {
    let mut sales_and_products: SalesAndProducts = {
        let data = fs::read_to_string("./data/sales.json").expect("LogRocket: error reading file");
        serde_json::from_str(&data).unwrap()
    };
    sales_and_products.sales[1].quantity += 1.5;
    fs::write("./data/sales.json", serde_json::to_string_pretty(&sales_and_products).unwrap())?;

    Ok(())
}

The call serde_json::from_str(&data) parses the JSON string, with the target type SalesAndProducts taken from the annotation on the sales_and_products variable. The code to increase the quantity of oranges sold then becomes very straightforward:

sales_and_products.sales[1].quantity += 1.5;

The rest of the source file compared to our dynamic approach remains the same.

Parsing TOML statically

For this section, we’ll focus on reading and parsing a TOML file. TOML is a common format for configuration files: its tables of key/value pairs map easily to a data structure like a dictionary or HashMap, and because the syntax strives to be concise, it is simple to read and write.

We will read and parse the TOML file statically. This means that we know the structure of our TOML file, and we will use predefined data structures in this section.

Our source code will contain the following structs, which map directly to the content of the TOML file when parsing it:

#![allow(dead_code)]
use serde::{Deserialize, Serialize};
use std::fs;

#[derive(Deserialize, Debug, Serialize)]
struct Input {
    xml_file: String,
    json_file: String,
}

#[derive(Deserialize, Debug, Serialize)]
struct Redis {
    host: String,
}

#[derive(Deserialize, Debug, Serialize)]
struct Sqlite {
    db_file: String
}

#[derive(Deserialize, Debug, Serialize)]
struct Postgresql {
    username: String,
    password: String,
    host: String,
    port: String,
    database: String
}

#[derive(Deserialize, Debug, Serialize)]
struct Config {
    input: Input,
    redis: Redis,
    sqlite: Sqlite,
    postgresql: Postgresql
}

fn main() {}

With a closer look at the above code snippet, you will see that we defined each struct to map to each table/header within our TOML file, and each field in the struct maps to the key/value pairs under the table/header.

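Next, using the serde and toml crates (serde_json is only needed for the JSON conversion at the end of this section), we will write the code that reads and parses the TOML file in the body of the main function. Remember to add the toml crate to the dependencies in Cargo.toml.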

// --snip--
fn main() {
    let config: Config = {
        let config_text = fs::read_to_string("./data/config.toml").expect("LogRocket: error reading file");
        toml::from_str(&config_text).expect("LogRocket: error reading stream")
    };
    println!("[postgresql].database: {}", config.postgresql.database); 
}

Output:

[postgresql].database: Rust2018

The differentiating part of the above code snippet is the toml::from_str function, which tries to parse the String value we read using the fs::read_to_string function. The toml::from_str function, using the Config struct as a guide, knows what to expect from the String value.

As a bonus, we can easily parse the above config variable to a JSON value using the below lines of code:

// --snip--
fn main() {
    // --snip--
    let serialized = serde_json::to_string(&config).expect("LogRocket: error serializing to json");
    println!("{}", serialized);
}

Output:

{"input":{"xml_file":"../data/sales.xml","json_file":"../data/sales.json"},"redis":{"host":"localhost"},"sqlite":{"db_file":"../data/sales.db"},"postgresql":{"username":"postgres","password":"post","host":"localhost","port":"5432","database":"Rust2018"}}

Parsing YAML statically

Another popular configuration file format used in projects is YAML. For this section, we take the static approach to reading and parsing a YAML file in a Rust project. The example is a YAML file named configuration, which holds database settings and an application port.

We’ll use the config crate to parse the YAML file, and as a first step, we will define the structs that the content of our YAML file will be parsed into.

#[derive(serde::Deserialize)]
pub struct Settings {
    pub database: DatabaseSettings,
    pub application_port: u16,
}
#[derive(serde::Deserialize)]
pub struct DatabaseSettings {
    pub username: String,
    pub password: String,
    pub port: u16,
    pub host: String,
    pub database_name: String,
}
fn main() {}

Next, we’ll read and parse the YAML file within our main function.

// --snip--
fn main() -> Result<(), config::ConfigError> {
    let mut settings = config::Config::default(); // --> 1
    let Settings { database, application_port }: Settings = {
        settings.merge(config::File::with_name("configuration"))?; // --> 2
        settings.try_into()? // --> 3
    };

    println!("{}", database.connection_string());
    println!("{}", application_port);
    Ok(())
}

impl DatabaseSettings {
    pub fn connection_string(&self) -> String { // --> 4 
        format!(
            "postgres://{}:{}@{}:{}/{}",
            self.username, self.password, self.host, self.port, self.database_name
        )
    }
}

The above code snippet has more moving parts than previous examples, so let’s explain each part:

  1. We initialize the Config struct using the default values of the fields’ types. You can inspect the Config struct to see the default fields.
  2. Using the config::File::with_name function, we search for and locate a YAML file named configuration. As defined by the docs, we merge in a configuration property source using the merge function on the Config struct.
  3. Using the source from the previous line of code, we attempt to parse the YAML file content into the Settings struct we defined. Note that merge and try_into belong to older releases of the config crate; more recent releases deprecate merge in favor of a builder API and rename try_into to try_deserialize, so check the documentation for the version you install.
  4. This is a utility function defined on the DatabaseSettings struct to format and return a Postgres connection string.

A successful execution of the above example will output:

postgres://postgres:<password>@<host>:5432/newsletter
8000
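If you want to try the connection_string helper on its own, without a configuration file or the config crate, the following sketch builds a DatabaseSettings value by hand. The struct mirrors the one defined in this section minus the serde derive, and the values returned by the hypothetical sample_settings function are placeholders, not the contents of the example YAML file.

```rust
/// Mirrors the DatabaseSettings struct from this section,
/// without the serde derive so the sketch needs no external crates.
pub struct DatabaseSettings {
    pub username: String,
    pub password: String,
    pub port: u16,
    pub host: String,
    pub database_name: String,
}

impl DatabaseSettings {
    /// Formats the fields as a Postgres connection string.
    pub fn connection_string(&self) -> String {
        format!(
            "postgres://{}:{}@{}:{}/{}",
            self.username, self.password, self.host, self.port, self.database_name
        )
    }
}

/// Hypothetical placeholder values, chosen only for illustration.
fn sample_settings() -> DatabaseSettings {
    DatabaseSettings {
        username: "app_user".to_string(),
        password: "example_password".to_string(),
        port: 5432,
        host: "localhost".to_string(),
        database_name: "newsletter".to_string(),
    }
}

fn main() {
    println!("{}", sample_settings().connection_string());
}
```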

Conclusion

Throughout this article, we explored how to read different files in Rust projects. The Rust standard library provides various methods for performing file operations, specifically read/write operations, and I hope this post has been useful in showing you how to read a file in Rust.

We also looked at the Serde crate and the important role it plays in helping us parse different file formats like YAML, JSON, or TOML into data structures our Rust programs can understand.

We explored three popular file formats: YAML, JSON, and TOML. As a next step, you can browse crates.io to discover other crates for reading and writing configuration formats outside the scope of this post, like INI and XML, in your next Rust project.
