Object-Relational Mapping with Entity Framework

Today, the World Wide Web is the most popular service of the largest computer network, the Internet. Over 100,000 new web pages join it every day, ranging from static with minimal interaction support to dynamic ones that require a large amount of up-to-date data to perform their functions.

The photo is illustrative. (Photo: Pixabay)

Author: Denis Balant, Enej Hudobreznik

Despite the rapid development of technology, the main language for managing relational databases is still SQL or structured query language, which with many customized derivatives (PostgreSQL, MySQL...) has its roots in the 70s of the last century.

Writing SQL queries often turns out to be quite time-consuming work, especially with more complex database structures. Instead of manually writing queries and conversions to objects, typical of object-oriented programming languages, it is much more natural and often simpler to map classes from source code to data schema and not the other way around. This saves development time, and at the same time it is much easier to recognize the relationships between entities. This method is called object-relational mapping (ORM).

For the .NET environment, Microsoft offers the Entity Framework Core (EF Core for short) open source framework for this purpose, which enables data-oriented development (the Code First approach), where interconnected data is simply collected in classes that represent tables, and the framework is based on on their basis, it understands relations and creates a database schema in its own format. The biggest advantage of such an approach is the possibility of installing the data schema on the desired database without knowing SQL.

EF Core communicates with databases through plug-in libraries called database providers, so switching to another database is easy. These can be installed by the user via the package manager (NuGet for C#). Official libraries are only available for Microsoft's SQL Server and Azure Cosmos DB solutions and the SQLite project, and thanks to the strong open source community, the framework supports practically all major relational databases (MySQL, PostgreSQL...).

As an example of data modeling, let's take a look at the "Student" table, which has the attributes first name (string of up to twenty characters), last name (string of up to twenty characters) and unique identifier (integer).

In PostgreSQL, we define this table as follows:

_{Student table definition in PostgreSQL (own source)}

In addition to the definitions of the basic types for first and last name, we also require that they are never empty (NOT NULL). The SERIAL type represents a unique integer, which in our case serves as the primary key of the Id. For each added entity, it is automatically determined based on the ID of the previously added identity, which is simply incremented.

However, it is much easier to define the table as a class in the source code of the application logic using an ORM that itself creates the object and the corresponding database requirements. The Key tag above the class attribute indicates the primary key, and Required indicates that the attribute must not be empty.

After development, the EF Core framework allows us to automatically generate a data schema from the code and deploy it to the selected database via the command line tools that are included in the framework. Gradual changes to the database schema are managed through these migrations, which ensure that the database remains synchronized with the application's data model. New migrations, which are stored in EF Core in the form of special classes, are created by comparing the current data model with the current database schema (the state of the last migration).

*Student table definition using EF Core in .NET environment (own source)*

The database connection is abstracted by a class that inherits from the DbContext class. Its attributes are collections of entities of type DbSet that map to tables.

_{Database connection class and student table defined (own source)}

The aforementioned framework for writing queries uses the LINQ syntax, which represents a unified way to retrieve and process data from different sources. The query is then translated into SQL, and the result itself is translated back into an object, its attribute, or an object table.

The example below shows a query for a student with a known enrollment number. All it takes is one call to the Find method with the primary key (ID) value.

_{Student query example with EF Core (own source)}

If we want to achieve the same result without using an ORM tool, it requires significantly more code. An example using the NpgSql library is shown. First, we need to create objects that represent the SQL query and read the database, while we must be careful to include the parameter correctly to avoid possible vulnerabilities, such as e.g. SQL injection. This time we have to create the student object ourselves, but we have to be careful about the order of the attributes in the query and the possibility of an error in the query.

_{Student query example using NpgSql library (own source)}

Object-relational mapping thus offers us a layer of abstraction that significantly speeds up software development due to its mapping from an object-oriented development plan, because SQL code does not need to be written separately from the source (often object-oriented) code, a well-written ORM supports good development patterns and practices for application design, while allowing non-SQL developers to more easily integrate a relational database into their application.

However, it is important to emphasize that ORM is not a perfect solution. The weakness lies precisely in the abstraction it offers us. It generates significantly more SQL code than a developer would write, which can greatly affect application speed, can hide bad database access practices, and is a backwards compatibility issue. Although the generated code is correct in most cases, it is still recommended to check and test it manually.

Object relational mapping is a functionality that is not native to the .NET environment. Functionality is provided by most frameworks and libraries for different languages. Examples are Django for Python, Gorm for Go, Spring for Java, Prisma for JavaScript (or Node.js)...