Due to the magnitude of this entry, it has been split into three parts.
Since these entries are long, the article can take ~20 seconds to load. This is a personal website on a low-cost host. Please be patient 🙁
When working with data values, the way we align them is not arbitrary. At first glance we may think we should just store one value next to the other no matter what. This is often true, but not always. In this article I am going to explain the different ways we have in Rust to explicitly define how we want our structures to be aligned.
The Rust compiler does not guarantee the same layout on every compilation, which means that our data structures may be laid out differently every time we change the code. Sometimes this is good (there is a reason why Rust does it), but sometimes it can lead to performance issues we do not want. It is always a fight between two properties: size vs. speed.
This is the main reason to choose one layout representation or another. Sometimes we are struggling with low-capacity microcontrollers and we need that extra space optimization to store our structures. In those situations it is better to represent our data structure layouts as sequentially as possible.
On the other hand, sometimes we face optimization problems where our code is not fast enough to perform the task we need. The cache and the way we access memory are the keys, and a well-organized data layout makes all the difference.
We are not going to discuss how cache and memory access work in computers, but you can find more about this topic in the following links:
With the #[repr] attribute we can specify the way we want our values to be represented in the layout. If no attribute is present, Rust will use its own representation, which does not guarantee any particular data layout.
The #[repr(C)] attribute lets us represent our data structures the way C does, which allows Rust to interoperate with C. The C representation has some well-defined properties. For example, a struct called T with two fields, a bool and a u32, would have an alignment of 4 bytes (based on the u32). Pseudocode:
/// Returns the amount of padding needed after `offset` to ensure that the
/// following address will be aligned to `alignment`.
fn padding_needed_for(offset: usize, alignment: usize) -> usize {
    let misalignment = offset % alignment;
    if misalignment > 0 {
        // round up to next multiple of `alignment`
        alignment - misalignment
    } else {
        // already a multiple of `alignment`
        0
    }
}

struct.alignment = struct.fields().map(|field| field.alignment).max();

let current_offset = 0;

for field in struct.fields_in_declaration_order() {
    // Increase the current offset so that it's a multiple of the alignment
    // of this field. For the first field, this will always be zero.
    // The skipped bytes are called padding bytes.
    current_offset += padding_needed_for(current_offset, field.alignment);

    struct[field].offset = current_offset;

    current_offset += field.size;
}

struct.size = current_offset + padding_needed_for(current_offset, struct.alignment);
A simple, small example, extracted from allaboutcircuits:
#[repr(C)]
struct T {
    c: u32,
    d: bool,
}
With #[repr(C)], a union is laid out with the size of its largest field rounded up to the union's alignment, and an alignment equal to the maximum alignment of all of its fields.
#[repr(C)]
union MyUnion {
    f1: u64,
    f2: [u32; 8],
}

fn main() {
    assert_eq!(std::mem::size_of::<MyUnion>(), std::mem::size_of::<u32>() * 8);
    assert_eq!(std::mem::align_of::<MyUnion>(), std::mem::align_of::<u64>());
}
The representation of an enum with fields is a struct of two fields, also called a “tagged union” in C:
// This enum has the same representation as ...
#[repr(C)]
enum MyEnum {
    A(u32),
    B(f32, u64),
}

// ... this struct.
#[repr(C)]
struct MyEnumRepr {
    tag: MyEnumDiscriminant,
    payload: MyEnumFields,
}

// This is the discriminant enum.
#[repr(C)]
enum MyEnumDiscriminant { A, B }

// This is the variant union.
#[repr(C)]
union MyEnumFields {
    A: MyAFields,
    B: MyBFields,
}

#[repr(C)]
#[derive(Copy, Clone)]
struct MyAFields(u32);

#[repr(C)]
#[derive(Copy, Clone)]
struct MyBFields(f32, u64);

fn main() {
    assert_eq!(std::mem::size_of::<MyEnum>(), std::mem::size_of::<MyEnumRepr>());
}
Hello, rustaceans! Version 1.1.0 of the Easy_GA crate is finally out! As always, you can find the repository on GitHub and the crate on crates.io.
The full changelog is available in CHANGELOG.md.
What's new:
- Benchmarking: you can now run cargo bench. The benches are all inside benches/.
- logger feature to get information about the execution.
- logger::VerbosityLevel to filter the output.
- logger::VerbosityType to choose between LOG, SAVE and LOG_AND_SAVE.

The logger is a very useful tool to measure and retrieve data from the execution. By default the logger is disabled; you can enable it this way:
use easy_ga::VerbosityLevel; // Verbosity level {DISABLED, LOW, MID, HIGH}
use easy_ga::VerbosityType; // Verbosity type {LOG, SAVE, LOG_AND_SAVE}
use easy_ga::LOG_verbosity; // Sets the verbosity level.
use easy_ga::LOG_verbosity_type; // Sets the verbosity type.
LOG_verbosity(VerbosityLevel::LOW); // VerbosityLevel::DISABLED by default
LOG_verbosity_type(VerbosityType::LOG_AND_SAVE); // VerbosityType::LOG by default
Benchmarking was added in version 1.1.0; you can run the benchmarks by downloading the repository and running cargo bench from the command line. The benchmarks are placed inside the benches/ folder.
Crate: https://crates.io/crates/easy_ga
GitHub repository: https://github.com/RubenRubioM/easy_ga
Easy_GA is a genetic algorithm library made for Rust projects. It provides full customization for your own genotype definitions and a genetic algorithm implementation that wraps all the common logic of a genetic algorithm.
- trait Gene: definition to implement for your custom genotypes.
- trait Selection: definition for your custom selection algorithms.
- Roulette: selection algorithm already implemented.
- Tournament: selection algorithm implemented with n members.
- GeneticAlgorithm: the main class that wraps the business logic of the genetic algorithm execution.

In your Cargo.toml you have to add the Easy_GA dependency:

[dependencies]
easy_ga = "*"
Now I will show you a basic example of Easy_GA that you can find in main.rs.
Files to include in order to use features:
use easy_ga::Gene; // For defining our own gene.
use easy_ga::GeneticAlgorithm; // To create a GeneticAlgorithm.
use easy_ga::SelectionAlgorithms; // To specify a concrete SelectionAlgorithm.
Definition of a custom Gene implementing the easy_ga::Gene trait:
#[derive(Clone, Copy)]
struct MyGene {
    // Fields.
    fitness: f64, // Recommended, to avoid recalculating the fitness on `get_fitness`.
}

impl Gene for MyGene {
    fn init() -> Self {
        // Gene constructor.
    }

    fn calculate_fitness(&mut self) -> f64 {
        // Fitness function.
    }

    fn crossover(&self, other: &Self) -> Self {
        // Crossover implementation.
    }

    fn mutate(&mut self) {
        // Mutation implementation.
    }

    fn get_fitness(&self) -> f64 {
        // Returns the fitness.
    }
}
At this moment, we need to implement the Clone & Copy traits for our Gene. I will try to remove that requirement in future versions.
Initialization of our GeneticAlgorithm:
let genetic_algorithm = GeneticAlgorithm::<MyGene>::new()
    .population_size(20)
    .iterations(50)
    .mutation_rate(0.10)
    .selection_rate(0.90)
    .selection_algorithm(Box::new(SelectionAlgorithms::Tournament(10)))
    .fitness_goal(100.0)
    .init()
    .unwrap();
We have other ways to initialize our GeneticAlgorithm, such as GeneticAlgorithm::new_with_values, if we don't want the method-chaining style.
Now that we have defined our genotype and initialized our GeneticAlgorithm, we have two ways of running it:
- GeneticAlgorithm::run: runs the algorithm until the end and returns a tuple (Gene, StopCriteria) with the best Gene of the execution and the reason the execution stopped.
let (gene, stop_criteria) = genetic_algorithm.run();
- GeneticAlgorithm::next_iteration: computes a single generation and returns the new population, letting us drive the loop ourselves:

while genetic_algorithm.is_running() {
    let new_generation: &Vec<MyGene> = genetic_algorithm.next_iteration();
}
This is a personal side project, mainly for me, so any further implementations will be done in my spare time as a good way to learn more about Rust.
Future work includes new Selection algorithms.

Easy_GA is licensed under the Mozilla Public License 2.0.
In some languages it is common to have conversions between different types, such as converting an integer to a floating point. There are two kinds of conversions in programming: implicit conversions and explicit conversions.
An explicit conversion, as the name suggests, is one where the programmer has to explicitly specify which conversion has to be done. The code may look messier this way, but the programmer stays in control of the code at all times.
#include <iostream>

void floatToInt(int x) {
    std::cout << x; // prints 42
}

int main() {
    float x = 42.51231;
    floatToInt(static_cast<int>(x));
}
Implicit conversions are a powerful tool that some languages provide, usually to make the code easier to read and more agile to write. Implicit conversions, as the name suggests, are done behind the scenes, which means that the language itself knows that one type has to be converted to another. In the code snippet below we can see an example:
#include <iostream>

void floatToInt(int x) {
    std::cout << x; // prints 42
}

int main() {
    float x = 42.51231;
    floatToInt(x);
}
In this case, we can see how easy it is to lose information in some data types if you rely on implicit conversions too much. This is the reason some modern languages, like Rust, don't allow implicit conversions.
The explicit specifier is a keyword that C++ provides; it can be used in constructors and conversion functions to avoid undesired implicit conversions.
#include <string>

struct X {
    X(int) {}
    operator std::string() const { return std::string(); }
};

struct Y {
    explicit Y(int) {}
    explicit operator std::string() const { return std::string(); }
};

int main() {
    X x1 = 10;
    X x2(10);
    X x3 = {10};
    std::string s1 = x1;
    std::string s2 = static_cast<std::string>(x1);

    Y y1 = 10; // Error: cannot convert int to Y
    Y y2(10);
    Y y3 = {10}; // Error: cannot convert initializer-list to Y
    std::string s3 = y2; // Error: implicit conversion not implemented.
    std::string s4 = static_cast<std::string>(y2);
}
In an effort to prevent issues with implicit conversions, C++11 introduced ‘uniform initialization’ or ‘braced initialization’ with the {} operator. This operator forces us to specify the exact type our constructor is expecting.
struct Z {
    Z(int, bool) {}
};

int main() {
    int x1(10.5); // Implicit conversion from double to int -> 10
    int x2{10.5}; // Error: narrowing conversion from double to int.
    Z z1(10.5, -1);
    Z z2{10, -1}; // Error: narrowing conversion from int to bool.
    Z z3{10, false};
}
But since braced initialization only applies when constructing a type or an object, if we want a specific function to accept only the type we are indicating, the solution is a little bit trickier.
struct Z {
    Z() = default;
    void Foo(int) {}
    void Foo(float) = delete;
    void Foo(bool) = delete;
};

int main() {
    Z z1;
    z1.Foo(1);
    z1.Foo(1.5f); // Error: use of deleted function
    z1.Foo(true); // Error: use of deleted function
}
We can use generic parametrization to ensure that only the overloads we declare are going to be called.
struct Z {
    Z() = default;
    void Foo(int) {}

    template<typename T>
    void Foo(T) = delete;
};

int main() {
    Z z1;
    z1.Foo(1);
    z1.Foo(true); // Error: use of deleted function
    z1.Foo(1.5); // Error: use of deleted function
    z1.Foo('a'); // Error: use of deleted function
}
In C++20, concepts were added, and they are the proper way to address this problem. Let's see an example:
template<typename T>
requires std::same_as<T, int>
void Foo(T) {}
But we can do it in a shorter way using auto.
template<typename T>
concept is_integer = std::same_as<T, int>;

void Foo(is_integer auto) {}
C++ is one of those languages that gives us tools to play with almost every aspect of an implementation. Usually, programming languages with type conversions have some conversions already defined by the standard. In addition, C++ allows us to define our own conversions. In the next code snippet we can see an explicit and an implicit conversion between two custom types. Both structs receive two std::string references and one int, and implement a custom cast to std::string.
#include <iostream>
#include <string>

struct FullNameExplicit {
    explicit FullNameExplicit(const std::string& name, const std::string& second_name, int age) :
        name(name),
        second_name(second_name),
        age(age) {}

    explicit operator std::string() const {
        return name + ' ' + second_name + " has " + std::to_string(age) + '\n';
    }

    std::string name;
    std::string second_name;
    int age;
};

struct FullNameImplicit {
    FullNameImplicit(const std::string& name, const std::string& second_name, int age) :
        name(name),
        second_name(second_name),
        age(age) {}

    operator std::string() const {
        return name + ' ' + second_name + " has " + std::to_string(age) + '\n';
    }

    std::string name;
    std::string second_name;
    int age;
};

void Foo(const std::string& person) {
    std::cout << person;
}

int main() {
    FullNameExplicit fne("Ruben", "Rubio", 24);
    Foo(fne); // Error: implicit conversion not defined.
    Foo(static_cast<std::string>(fne));

    FullNameImplicit fni("Ruben", "Rubio", 24);
    Foo(fni);
    Foo(static_cast<std::string>(fni));
}
In my humble opinion, we should avoid implicit conversions as much as possible. In particular, when other people are involved in the code they may not be aware of the implicit conversions you are using, and it is easy to get lost in a large project.
ADL, also known as argument-dependent lookup or Koenig lookup (even though Koenig said he did not discover it), is a very old and unfamiliar C++ feature.
ADL's job is to define a set of rules for looking up functions based on the namespaces of the arguments provided, e.g.:
#include <iterator>
#include <vector>

int main() {
    std::vector<int> v;
    auto it = std::begin(v); // Common std::begin function.
    auto itAdl = begin(v); // Also calls std::begin, because 'v' is in the std namespace.
}
This is a common piece of code in C++ that everyone has written without knowing exactly why it works. In this case both ways are equivalent because both call std::begin: the first one directly, and the second one thanks to the ADL set of rules.
This is very powerful, especially when working with operators, like the example below:
#include <iostream>
#include <string>

int main() {
    const std::string& out = std::string("Hello") + "world";
    std::cout << out;
}
There is no global + or << operator here, but thanks to ADL the compiler is smart enough to look for them in their related namespaces. Let's take a look at the real code generated by the compiler using https://cppinsights.io/:
int main() {
const std::basic_string<char, std::char_traits<char>, std::allocator<char> > & out = std::operator+(std::basic_string<char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> >("Hello", std::allocator<char>())), "world");
std::operator<<(std::cout, out);
}
Both operators have been converted to std::operator calls.
Another very cool feature is that our own function definitions take priority when they shadow some std implementation:
#include <vector>

template<typename T>
int accumulate(T begin, T end, int sum) {
    return 42;
}

int main() {
    std::vector<int> values{10, 30, 10};
    int sum{0};
    accumulate(values.begin(), values.end(), sum); // Returns 42.
}
In this case, if our own accumulate implementation had not been provided (and <numeric> were included), std::accumulate would have been found through ADL instead.
Obviously we can take advantage of this and use it in our own code. Even though I personally think it is better practice to always specify the namespace of our functions and objects, it can be a very powerful solution if at some point we want to change a large piece of code just by swapping namespaces.
namespace ns1 {
    struct X {};
    void g(const X x) {}
    void f(const X x) {}
}

namespace ns2 {
    struct X {};
    void g(const X x) {}
    void f(const X x) {}
}

int main() {
    ns1::X x;
    g(x); // ns1::g(ns1::X) is called.
    f(x); // ns1::f(ns1::X) is called.
}
In this particular example if we change the namespace in the object declaration, all the function calls would also change.
You might think this is a silly debate, and maybe you are right, but let me take a moment to explain whether we should care about returning true or false.
True and false are the basis of computer programming and traditional logic. The binary system is based on these two possibilities only: the value is either one or zero.
To understand whether returning true or false has an impact on our code's performance, we have to talk about how computers work at a very low level; specifically, I am going to talk a bit about x86-64 assembly and different C++ compilers.
Once we compile our C++ code, the compiler converts it into assembly code. In assembly we usually talk about the number of instructions instead of the number of lines, and different instructions can take a different number of cycles to be processed by our CPU. Okay, let's see how return true and return false compile into assembly using GCC 11.2 and Clang 12.0.1, both without any optimizations.
Clang 12.0.1 (no optimizations):
/*
push rbp
mov rbp, rsp
xor eax, eax
and al, 1
movzx eax, al
pop rbp
ret
*/
bool ReturnFalse() {
return false;
}
/*
push rbp
mov rbp, rsp
mov al, 1
and al, 1
movzx eax, al
pop rbp
ret
*/
bool ReturnTrue(){
return true;
}
GCC 11.2 (no optimizations):
/*
push rbp
mov rbp, rsp
mov eax, 0
pop rbp
ret
*/
bool ReturnFalse() {
return false;
}
/*
push rbp
mov rbp, rsp
mov eax, 1
pop rbp
ret
*/
bool ReturnTrue(){
return true;
}
As we can see, Clang takes four more instructions than GCC for the same piece of code. Now let's take a look at what happens when we turn optimizations on (-O3) in both compilers.
Clang 12.0.1 (-O3):
/*
xor eax, eax
ret
*/
bool ReturnFalse() {
return false;
}
/*
mov al, 1
ret
*/
bool ReturnTrue(){
return true;
}
GCC 11.2 (-O3):
/*
xor eax, eax
ret
*/
bool ReturnFalse() {
return false;
}
/*
mov eax, 1
ret
*/
bool ReturnTrue(){
return true;
}
Now both compilers perform each task in only 2 instructions. But, as I mentioned before, not all instructions are the same, so let's take a moment to analyze this at the machine-code level.
The definition of the mov instruction is: “Copies the second operand (source operand) to the first operand (destination operand)”, which means that we are copying the right value into the left register. Its translation to machine code is:
mov eax, 1 # b8 01 00 00 00
mov al, 1 # b0 01
Why is the machine code different if both instructions perform the same operation? Because the encoding of a copy depends on the register: eax is a 32-bit register, while al is an 8-bit subset of the eax register.
On the other hand, the xor definition says: “Performs a bitwise exclusive OR (XOR) operation on the destination (first) and source (second) operands and stores the result in the destination operand location”. It is a classic XOR logic operation with the property that, applied to a register with itself, the result is always 0. The CPU is extremely optimized to perform this kind of bitwise operation.
Returning 0 (false) seems to be better than returning 1 (true), but the difference is almost indistinguishable. Even so, it is always good practice to analyze your problem and optimize it as much as you can. You cannot always return false instead of true, but you can study the percentage of times your result will be true or false. This is a common practice when developing in assembly, especially when you have to deal with if-else chains and you want to get rid of the evaluations as fast as possible.
Shinobu is a video game made for the Amstrad CPC using Z80 assembly and the CPCtelera library. The project required optimizing every piece of memory due to the Amstrad CPC limitation of 16KB of RAM.
The source code can be found in the GitHub repository, and the game can be played not only on a physical Amstrad CPC but also in a web browser.
If you know a bit of modern C++ you will have used the std::vector<T> container many times without even wondering what type you use to instantiate it. This is fine in every case except one specific one: std::vector<bool>.
If you go to the std::vector documentation you will find that there is only one specialization: std::vector<bool>. That is because the bool specialization is implemented outside the standard std::vector<T> container. The reason may be pretty obvious and logical to you, but it can also be a headache for others who try to use it.
As you may know, a bool has a size of 1 byte in the C++ standard. This may make us wonder why we should use 8 bits when we can know whether a bool is true or false with just 1 bit of information: either 1 = true or 0 = false. Well, that is exactly what the C++ committee thought back in the day, and it decided to use only 1 bit to represent each value in the container. This way it can store up to 8 values in just 1 byte of memory, 8 times more than using standard booleans.
And with that decision the problems come. As you may know, a reference is just a memory address to the beginning of a specific piece of data, which means that you can point to the start of something and advance the position byte by byte.
#include <iostream>
#include <vector>

int main() {
    std::vector<int> vector_int{1, 2};
    std::cout << &vector_int[0]; // e.g. 0x1e06eb0
    std::cout << &vector_int[1]; // e.g. 0x1e06eb4
}
As you can see, the offset between the first and the second integer is 4 bytes, because that is the size of an int on this platform. Now let's try the same with std::vector<bool>.
#include <iostream>
#include <vector>

int main() {
    std::vector<bool> vector_bool{true, false};
    std::cout << &vector_bool[0]; // error: taking address of rvalue.
}
The error the compiler gives us can be easily understood by checking the std::vector<bool> documentation: operator[] returns a proxy object rather than a bool&.
For the same reason you can't take values by reference using range-based for loops, which are implemented on top of begin() and end().
#include <iostream>
#include <vector>

int main() {
    std::vector<bool> vector_bool{false, true};
    std::vector<int> vector_int{1, 2, 3};

    for (int& i : vector_int) {
        // OK.
    }

    for (bool& b : vector_bool) {
        // error: cannot bind non-const lvalue reference of type 'bool&' to an rvalue of type 'bool'
    }
}
Finally, as a last piece of evidence of the difference between std::vector<T> and std::vector<bool>, we can compare their sizes in bytes.
#include <iostream>
#include <vector>

int main() {
    std::cout << sizeof(std::vector<int>); // 24 bytes (libstdc++, x86-64).
    std::cout << sizeof(std::vector<bool>); // 40 bytes (libstdc++, x86-64).
}
Twitch Loot Auto-Clicker is a browser extension to improve the experience of using twitch.tv. The extension finds where you have channel points available to collect in a Twitch channel and auto-clicks them for you to maximize your rewards.
Its second utility is to bring you an extensive and detailed summary of the points earned while using the extension.
The extension has over 700 daily active users and is growing every day, with users all around the globe.
The source code is completely open in the GitHub repository and the extension can be downloaded from the Chrome Web Store.
Technologies: