Part 4: Modularizing
Let’s take a small detour from the distributed systems-focused workflow here to
focus on modularizing our code. Right now we have a mammoth main() function
that simply becomes spaghetti if
we want to add more functionality. Some of the more “standards-enjoyer”-like of
y’all might actually prefer limiting your code to 80 characters wide (it’s also
a goal that I have tried to maintain for the markdown files in this blog), which
you can already see might lead to funky indentations and nearly-unreadable code.
There is also another issue: having a single method (or entity, for that matter)
handle everything is an anti-pattern. It becomes something called the God
Object. While it may make sense to use a God Object when you want efficiency, it
is generally bad practice to have a single entity conquer every problem. You
might unknowingly introduce dependencies and undefined contracts between
“stages” of your God Object, and you might end up with variables like
request_to_x, response_from_x, request_to_y, response_from_y_at_first, etc. Then
you might hit a bug because there is a certain variable that you’re
unintentionally reusing between workflows. The issues can be endless!
So how do we fix this?
What if we could neatly organize our code? We can simply have our main function say “Hey component X, do your thing”. Component X now spins off its workflow. Component X is responsible for its workflow, and any issues with Component X are limited to Component X.
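As a rough sketch of that idea (ComponentX, ComponentY and run_app are made-up names for illustration, not part of this project), the top-level function only wires components together and asks each one to do its thing:

```rust
/// Hypothetical component: owns its own state and its own workflow.
pub struct ComponentX {
    jobs_done: u32,
}

impl ComponentX {
    pub fn new() -> Self {
        ComponentX { jobs_done: 0 }
    }

    /// Everything X needs to do lives here; bugs in X stay in X.
    pub fn run(&mut self) -> String {
        self.jobs_done += 1;
        format!("X finished job #{}", self.jobs_done)
    }
}

/// A second hypothetical component with a completely separate workflow.
pub struct ComponentY;

impl ComponentY {
    pub fn run(&self) -> String {
        String::from("Y finished its workflow")
    }
}

/// Stand-in for main(): it delegates and knows nothing about the internals.
pub fn run_app() -> Vec<String> {
    let mut x = ComponentX::new();
    let y = ComponentY;
    vec![x.run(), y.run()]
}
```

Neither component can see the other's state, so there is no shared variable to accidentally reuse between workflows.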
This design helps parallelize a lot of development, at a (usually) acceptable cost of performance and complexity. For those working on kernels, OSes and high-performance code, the additional cost builds up over time, and it might not make sense to sacrifice performance for readability. Fortunately, most code does not require wringing out every last nanosecond of performance at the cost of readability and maintainability.
Segue aside, using components also allows us to test them in isolation, a practice called “unit testing”. This helps design and test the exact contract (expected behavior) of that component. We shall explore this in a future article.
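Until then, here is a small taste of what testing a component in isolation can look like. This is only a sketch: build_reply is a hypothetical helper, borrowing the "Returning: ..." reply format that shows up later in this article, and the test pins down its contract without touching any sockets.

```rust
/// Hypothetical helper: builds the reply that gets echoed back to a client.
pub fn build_reply(data: &str) -> String {
    format!("Returning: {}", data)
}

#[cfg(test)]
mod tests {
    use super::*;

    // The contract: the reply is the input with a fixed prefix.
    #[test]
    fn reply_is_prefixed() {
        assert_eq!(build_reply("ping"), "Returning: ping");
    }
}
```

Running cargo test compiles the tests module and runs every function marked #[test]; a normal build skips it.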
So, how do we define modules for our project?
Creating a network module
Now that we know we don’t necessarily want our main function to handle the
nitty-gritty of creating a connection, sending a message to the socket, etc., we
can move the code into its own module, possibly using a struct to encapsulate
related behavior. When we need to add functionality to our network module, we
simply add that functionality to the appropriate struct, test it out, and let
the caller defer the logic to an instance of the struct. You know, simple OOP
stuff (if you ignore inheritance, which Rust does not support).
Our code shall therefore look like this:
[ Project ] --contains--> [ Modules ] --contain--> [ Structs ]
You can create a module in main.rs if you wish, but that does not help in
making the file smaller, and is not scalable. Instead, we shall create a new
folder that holds this module, and access code from this module from main.rs.
Structure
Let’s design our module as follows. I renamed the speaker to transmitter, and I renamed the listener to receiver. Just like a traditional duplex communication system.
project_root
├── main.rs
└── network module (nw)
    ├── transmitter.rs
    └── receiver.rs
This encapsulates our network module in its own folder, and keeps any changes we
need to make for the module within it. It also allows us to reuse file names (as
long as they stay in different folders) and avoid making our project root too
wide. With this approach, we will need to let the main function know what is
exported from these modules. To do that, we shall use a mod.rs file, which
tells Rust that this folder is a module and declares the submodules we wish to
export from here, acting as the entry point for this module.
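For comparison, the same hierarchy can be written inline in a single file, which may make it easier to see what the folder and its mod.rs stand for. This is a sketch only: the describe() functions are hypothetical placeholders for whatever each file ends up exporting.

```rust
// Inline equivalent of the folder layout: `mod nw { ... }` stands in for the
// nw/ folder, and each inner `pub mod` stands in for one file inside it.
mod nw {
    pub mod receiver {
        /// Placeholder for whatever receiver.rs will export.
        pub fn describe() -> &'static str {
            "receiver: listens for incoming data"
        }
    }

    pub mod transmitter {
        /// Placeholder for whatever transmitter.rs will export.
        pub fn describe() -> &'static str {
            "transmitter: sends data out"
        }
    }
}

/// Stand-in for code in main.rs reaching into the module by path.
pub fn list_components() -> Vec<&'static str> {
    vec![nw::receiver::describe(), nw::transmitter::describe()]
}
```

The paths used by the caller (nw::receiver::..., nw::transmitter::...) stay the same once the inner modules move into their own files.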
Exposing functionality
Rust has a “private” by default approach to members and modules. This means that these entities are private and internal to their owner, unless explicitly marked public. This behavior makes any exposure of internal data intentional, similar to how Rust makes all variables immutable unless the programmer explicitly declares them mutable with mut. Least permissive by default; explicitly grant more permissions only when necessary.
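Both defaults can be seen in a few lines. The account module below is a made-up example (not part of this project): the field is private, the methods are explicitly pub, and the caller has to opt in to mutation with mut.

```rust
mod account {
    pub struct Account {
        // No `pub` here: only code inside `account` can touch this field.
        balance: u64,
    }

    impl Account {
        pub fn new() -> Self {
            Account { balance: 0 }
        }

        // `&mut self`: callers need a mutable binding to call this.
        pub fn deposit(&mut self, amount: u64) {
            self.balance += amount;
        }

        pub fn balance(&self) -> u64 {
            self.balance
        }
    }
}

pub fn deposit_demo() -> u64 {
    // Without `mut`, the call to deposit() below would not compile.
    let mut acct = account::Account::new();
    acct.deposit(5);
    // acct.balance = 100; // would not compile: field `balance` is private
    acct.balance()
}
```

Uncommenting the direct field write, or dropping the mut, turns each default into a compile error rather than a runtime surprise.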
So let’s go bottom-up. We shall define the struct, define its methods, make it visible to the module, and then make the module visible to the rest of the program.
The receiver struct
We want to define a receiver which can listen on a specific address. Let’s
define it as below, in nw/receiver.rs:
pub struct Receiver {
    listening_address: String,
}
The pub keyword means that this struct is visible to outside entities. Not
adding this keyword keeps the struct private to the module by default. Note how
listening_address is not pub. It’s private to prevent unexpected access to
the data member. If you have no need to access it directly, why should you risk
it being accessed?
Let’s move on to adding the methods for this. These will be defined using the
impl keyword. This includes the constructor(s) for this struct.
We can define the constructor as:
impl Receiver {
    pub fn new(address: String) -> Self {
        Receiver {
            listening_address: address,
        }
    }
}
This differs from traditional constructors we are familiar with in Java, Python
or C++ in that we cannot simply say Receiver(). Technically, you could name
the function Receiver, but the Rust compiler is going to warn you, as the naming
convention requires function names to be in snake_case. Plus, given how Rust
does not support function overloading, an explicit constructor detailing its
purpose is always good practice.
This function does nothing complex: It takes a String input, creates a Receiver
struct entity with the listening_address set to the input and returns it. This
differs from traditional constructors where the constructor is a part of the
actual object. In this case, the constructor is a static method (an associated
function, in Rust terms) that returns a new instance of the struct.
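Since there is no overloading, a second way of building a Receiver simply gets its own descriptive name. The sketch below assumes two additions that are not in the original code: a localhost() constructor and a read-only listening_address() accessor, both purely illustrative.

```rust
pub struct Receiver {
    listening_address: String,
}

impl Receiver {
    /// The constructor from the article: an associated function, not a method.
    pub fn new(address: String) -> Self {
        Receiver {
            listening_address: address,
        }
    }

    /// Hypothetical named constructor: same struct, different inputs.
    pub fn localhost(port: u16) -> Self {
        Self::new(format!("127.0.0.1:{}", port))
    }

    /// Hypothetical accessor: exposes the address read-only, so the field
    /// itself can stay private.
    pub fn listening_address(&self) -> &str {
        &self.listening_address
    }
}
```

Callers would write Receiver::new(addr) or Receiver::localhost(8080); the name at the call site says which construction path is being used.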
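Remember to import the necessary entities at the top of the file: TcpListener from tokio::net, plus the AsyncReadExt and AsyncWriteExt traits from tokio::io, which provide read() and write_all()!
Let’s define the actual behavior here, within impl Receiver { ... }: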
    pub async fn listen(&self) {
        let listener = match TcpListener::bind(&self.listening_address).await {
            Ok(v) => v,
            Err(e) => {
                panic!("Error in setting up listener: {:?}", e);
            }
        };
        println!("Server is listening on {}", self.listening_address);
        loop {
            let (mut socket, inc_addr) = match listener.accept().await {
                Ok(stream_and_addr) => stream_and_addr,
                Err(e) => {
                    panic!("Cannot listen at address! Error: {:?}", e);
                }
            };
            println!("New connection from {}", inc_addr);
            // Spawn a new task to handle the connection
            tokio::spawn(async move {
                let mut buf = vec![0; 1024];
                loop {
                    // Read data from the socket as long as it comes
                    match socket.read(&mut buf).await {
                        Ok(0) => {
                            // Zero bytes implies closed connection
                            println!("Connection closed by {}", inc_addr);
                            return;
                        }
                        Ok(n) => {
                            // Echo the data back to the client with additional
                            // info
                            let data = String::from_utf8_lossy(&buf[..n]);
                            println!("Received data: {}", data);
                            let returnable = format!("Returning: {}", data);
                            if let Err(e) =
                                socket.write_all(returnable.as_bytes()).await
                            {
                                eprintln!(
                                    "Failed to write to socket; error = {:?}",
                                    e
                                );
                                return;
                            }
                            // Clear the buffer
                            buf.fill(0);
                        }
                        Err(e) => {
                            // Could not read from socket
                            eprintln!(
                                "Failed to read from socket; error = {:?}",
                                e
                            );
                            return;
                        }
                    }
                }
            });
        }
    }
This functionality is very similar to how we handled it in main(). However,
the main difference is that we do not want to use .await? here, since the ?
operator would require us to define a Result<> to be returned. Instead, we
explicitly process each match arm and either gracefully handle the situation,
or panic. (As we develop our code, we shall focus on removing panics as well for
better reliability.) We pass &self as the first parameter because we want to
make this method non-static, tied to an instance of the struct.
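The trade-off between the two styles can be seen on a much smaller problem. This is a standard-library-only sketch, and both function names are made up for illustration: the first propagates the error with ? and therefore has to return a Result, while the second deals with the error on the spot in a match arm.

```rust
use std::num::ParseIntError;

/// With `?`, the error is handed to the caller, so the signature must
/// return a `Result`.
pub fn parse_port_propagating(text: &str) -> Result<u16, ParseIntError> {
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

/// With an explicit `match`, the function decides locally what to do and
/// can keep a plain return type.
pub fn parse_port_or_default(text: &str, fallback: u16) -> u16 {
    match text.trim().parse::<u16>() {
        Ok(port) => port,
        Err(e) => {
            eprintln!("Invalid port {:?}: {}; using {}", text, e, fallback);
            fallback
        }
    }
}
```

The listen() method follows the second style: each failure is handled where it happens, so callers never have to unwrap anything.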
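We have also defined the function to be async, which means that calling it
returns a future. Whenever that future is stuck waiting on a task (such as an
incoming connection), it yields at the .await point instead of blocking the
runtime, which is then free to execute other tasks.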
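One detail that is easy to miss: calling an async function only builds a future, and its body runs once something awaits or polls it. The sketch below shows this with the standard library alone by polling a future by hand. In the real project Tokio does the polling; poll_once, NoopWaker and record_call are made-up names for this illustration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand in a sketch.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once. Real programs leave this to a runtime.
pub fn poll_once<F: Future>(future: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    let mut future = pin!(future);
    future.as_mut().poll(&mut context)
}

/// Bumps the counter when (and only when) the future is actually driven.
async fn record_call(calls: &Cell<u32>) -> u32 {
    calls.set(calls.get() + 1);
    calls.get()
}

/// Returns the counter before and after the future is polled.
pub fn lazy_demo() -> (u32, u32) {
    let calls = Cell::new(0);
    let future = record_call(&calls); // nothing has run yet
    let before = calls.get();
    let _result = poll_once(future); // now the body executes
    (before, calls.get())
}
```

This is why receiver.listen() on its own does nothing until it is followed by .await or handed to something like tokio::spawn.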
The entire functionality is now within the context of the receiver. If we want
another component to use the receiver, all it needs to do is create a new
instance using Receiver::new() and call listen() on it. The caller does not need
to know, or worry about, the internal implementation details.
The transmitter struct
The transmitter struct can also be created as follows, in nw/transmitter.rs:
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream;

pub struct Transmitter {
    destination: String,
    payload: Vec<u8>,
}

impl Transmitter {
    pub fn new(destination: String, payload: Vec<u8>) -> Self {
        Self { destination, payload }
    }

    pub async fn transmit(&self) {
        // Connect to the destination; give up (without panicking) on failure
        let mut write_stream =
            match TcpStream::connect(self.destination.as_str()).await {
                Ok(val) => val,
                Err(e) => {
                    eprintln!("Could not connect: {:?}", e);
                    return;
                }
            };
        // Send the whole payload
        match write_stream.write_all(self.payload.as_slice()).await {
            Ok(_) => {
                println!("Wrote data");
            }
            Err(e) => {
                eprintln!("Error in writing: {:?}", e);
            }
        }
    }
}
Again, we define a constructor and the necessary functionality, marking the
struct and the methods pub to make them accessible outside the module.
Adding them to the module exports file
nw/mod.rs, our module exports file, will look like this:
pub mod receiver;
pub mod transmitter;
We are essentially telling anyone who scans this file that there are two public
modules, receiver and transmitter, available.
Importing the module
Importing the module in main is straightforward. All you need to do is add the following line:
mod nw;
This tells the compiler that there is a module nw that we want to use. Now all
we need to do to initialize our receiver is:
let receiver = nw::receiver::Receiver::new(addr);
We can then spawn our transmitter as follows:
tokio::spawn(async move {
    // Sleep first to let the server start up. tokio::time::sleep yields to the
    // runtime, whereas thread::sleep would block a runtime worker thread.
    // (Assumes Tokio's "time" feature is enabled.)
    let two_s = std::time::Duration::from_secs(2);
    tokio::time::sleep(two_s).await;
    let data = String::from("Talking to myself").into_bytes();
    let transmitter = nw::transmitter::Transmitter::new(target_addr, data);
    transmitter.transmit().await;
});
And finally, we can listen on our receiver as follows:
receiver.listen().await;
Compile and run it, and you should see no changes in your functionality, but
now your main() function looks a lot cleaner, and it only contains code for
functionality specific to it.
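If you want a quick sanity check of the round trip that does not depend on the async runtime, one option is a blocking stand-in. To be clear about what is swapped in: this sketch uses std::net and a thread instead of Tokio, echo_round_trip is a hypothetical helper, and unlike our Transmitter it also reads the reply so that there is something to compare against.

```rust
use std::io::{Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Sends `payload` to a one-shot echo server on localhost and returns the
/// reply. Blocking stand-in for the Receiver/Transmitter pair.
pub fn echo_round_trip(payload: &[u8]) -> std::io::Result<String> {
    // Port 0 asks the OS for any free port, so the sketch never collides.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Receiver side: accept one connection and answer it once.
    let server = thread::spawn(move || -> std::io::Result<()> {
        let (mut socket, _peer) = listener.accept()?;
        let mut received = Vec::new();
        socket.read_to_end(&mut received)?;
        let reply =
            format!("Returning: {}", String::from_utf8_lossy(&received));
        socket.write_all(reply.as_bytes())
    });

    // Transmitter side: send the payload, then signal that we are done
    // writing so the server's read_to_end() can finish.
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(payload)?;
    stream.shutdown(Shutdown::Write)?;

    let mut reply = String::new();
    stream.read_to_string(&mut reply)?;

    server.join().expect("receiver thread panicked")?;
    Ok(reply)
}
```

Calling echo_round_trip(b"Talking to myself") should hand back the payload prefixed with "Returning: ", mirroring what the async receiver echoes to its clients.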