PostgreSQL is an object-relational database system that enables developers to create and manage and create a relational database instance. With PostgreSQL, you can store, manipulate, and retrieve data and support different functions of SQL. It’s used to store data for many mobile, web, infrastructure, and analytics applications. PostgreSQL offers an advanced usability for processes and has full support for foreign keys, properties, joins, views, triggers, memory events, and stored procedures in many languages.
In this guide, we’ll look at PostgreSQL features and architecture
Features of PostgreSQL
- Multi-version concurrency control: This allows users to concurrently read and write tables, blocking for only concurrent updates of the same row. Multi-version concurrency control also helps to avoid clashes within your server properties.
- Scalability: PostgreSQL supports Unicode, international character sets, and multi-byte character encoding, which can be used for sorting, case sensitivity, and formatting. PostgreSQL is highly scalable and used for the number of concurrent users and the quantity of data it can manage.
PostgreSQL is cross-platform and can run on many operating systems, including Linux, Microsoft Windows, OSX, FreeBSD, and Solaris.
- Dynamic Loading: The PostgreSQL server can also include user-written code into itself via dynamic loading. With dynamic loading, the user can specify an object code file for example, a shared library that implements a new function or type, and PostgreSQL will load it as required. The ability to modify its operation on the fly makes it uniquely suited for implementing new storage structures and applications rapidly.
- ACID Compliance: PostgreSQL is fully ACID-compliant (Atomicity, Consistency, Isolation, Durability), ensuring data integrity and consistency even in the event of system crashes or power failures.
- Extensibility: PostgreSQL is highly extensible and allows for custom functions, operators, and aggregate functions. It supports a variety of procedural languages such as PL/pgSQL, PL/Python, PL/Perl, and even PL/Java.
- Rich Data Types: Along with standard numeric, text, and date/time data types, PostgreSQL also supports a range of advanced data types such as arrays, hstore (for key-value pairs), JSON/JSONB (for unstructured data), and PostGIS for geographic objects.
- Replication and Partitioning:PostgreSQL includes built-in binary replication (both synchronous and asynchronous) for standbys and read replicas. Table partitioning is also built into PostgreSQL for large databases, allowing for faster query execution on large datasets.
Architecture of PostgreSQL
PostgreSQL is an object-relational database system that extends the SQL language combined with many features to efficiently store and scale complicated data workloads. It supports both SQL (relational) and JSON (non-relational) querying, making it a versatile choice for a variety of applications.
PostgreSQL's architecture is primarily made up of three major components: the process architecture, memory architecture, and storage architecture.
PostgreSQL uses a multi-process architecture, where separate processes are created for each client connection. This differs from the thread-based model where a single process handles multiple connections.
- Client/Server Model: PostgreSQL uses a client-server model where the database runs as a server, and clients connect to this server to perform database operations. The PostgreSQL server can handle multiple client connections simultaneously.
- Backend Processes: When a client connects to the PostgreSQL server, a new process called a "backend" process is created. This process is dedicated to managing the client's requests.
- Master Process: The master process (postmaster) is responsible for managing backend processes. It coordinates the initiation of new backend processes and handles connections and disconnections.
The memory structure of PostgreSQL is divided into shared memory and process memory.
Shared Memory: This is a common memory that's accessible by all backend processes. It is mainly used to store the buffer cache, lock tables, and other server-wide data. The PostgreSQL server has a simple structure consisting of shared memory, background processes, and a data directory structure. In the database scenario, a request is sent by the client to the server. Then, the PostgreSQL server processes the data using shared buffers and background processes. Shared buffers server to minimize the server disk IO.
Process Memory: Each backend process in PostgreSQL has its own private memory called process memory, also known as local memory. It holds data and structures that only the owning process needs to access, such as sort space, caches, and connection state. The memory space is used for bitmap operations, sorting, merging joins, and hash joins to write data into temporary files. The maintenance work memory is used for database operations, such as analyzing, vacuuming, altering tables, and creating indexes. During this process, a checkpoint occurs, and the buffer is written on the file. The
checkpointer writes all pages from memory to disk and cleans the shared buffer area, and if your database crashes, data loss can be obtained by obtaining the difference between the last checkpoint time and PostgreSQL stop time.
In configuring your PostgreSQL DB, it’s great to set the value of the shared buffer to a certain percentage, like 27% of the total memory.
- WAL Buffers: Write-Ahead Logging (WAL) is a key component of PostgreSQL's architecture, sitting at the intersection of its memory and storage architectures. The role of the WAL is to record changes to the database to ensure data integrity and assist in recovery in case of a crash.In PostgreSQL, before any change (insert, update, delete, etc.) is made to the database, a record of the change is written to the WAL. Only after this record is safely written to disk is the change allowed to be applied to the actual database. This mechanism ensures that even if a crash occurs in the middle of a transaction, the database can be recovered to a consistent state by
replayingthe changes from the WAL.
The WAL buffers come into play as part of this process. When a change is made, it's first written to a WAL buffer in memory. These buffers are then flushed to disk as needed - either when a buffer becomes full, when a transaction is committed, or as part of a periodic checkpoint operation.
WAL buffers temporarily store changes to the database. The WAL file consists of contents written by the WAL buffer at the predetermined point in time. WAL buffers and WAL files are significant to recover your data during backup and recovery.
The PostgreSQL storage system organizes data into tables, databases, and schemas.
- Tablespaces: In PostgreSQL, a tablespace is a location on disk where PostgreSQL stores data files containing database objects. It allows administrators to control disk layout.
- Database: A database in PostgreSQL is a collection of schemas and serves as a container for all your tables, indices, views, sequences, etc.
- Schemas: A schema is a named collection of database objects, including tables, views, indexes, data types, functions, and operators.
What is a Public Schema and Public Role
A user in PostgreSQL creates the role and login permissions for your server processes. (Create user = create role + login permission) The difference between the user and the role is the user can log in, but the role will not be able to log in. In PostgreSQL, creating a user is also the same as creating roles with login permissions, and by default, all the new users and roles inherit the permissions from the public role.
Whenever you create a new database, PostgreSQL, by default, creates a schema named public, which means a user who has access to your database will be able to select or create the objects and properties in that particular schema. When you add a new user and create a new role by default, they are granted the public role and can create an object in the public schema.
Getting started with PostgreSQL Server
pgAdmin is the most popular open-source management and development platform for PostgreSQL databases. It provides a graphical interface that makes it easier for beginners and experienced users alike to create, maintain, and use PostgreSQL databases.
Cross-Platform Compatibility: pgAdmin is a cross-platform tool, meaning it can be used on Windows, Mac OS X, and Linux. This makes it a flexible option for diverse development environments.
Comprehensive Functionality: With pgAdmin, you can manage almost all PostgreSQL features, such as managing databases, users, roles, tables, views, schemas, sequences, etc. It can handle complex SQL queries, has a powerful SQL editor, and supports data export and import.
Visual Query Builder: pgAdmin's visual query builder allows users to create complex queries without needing to write SQL code. This is especially useful for users who are not familiar with SQL syntax.
Support for Multiple PostgreSQL Versions: pgAdmin is built to manage a wide range of PostgreSQL versions, all the way back to PostgreSQL 9.2. This makes it a versatile tool for environments where different PostgreSQL versions are in use.
- In this guide, I'll use the pgAdmin tool to create database tables, connect to my server, and configure processes.
- Enter the name of your server, and specify the group you want your server to belong to.
- Configure your connections; in your connection tab, the hostname is the address of your server machine where your PostgreSQL server is running. We’ll be using
localhostfor the hostname and address. The local host is a reserved keyword that will be translated into the IP address, and the default IP address is
127.0.0.1and the default port number for the PostgreSQL server is
- The maintenance database is the database name you want to connect to. We’ll use postgres as the maintenance database; the default database PostgreSQL server creates for us during the installation process, which is also the default username.
- Kerberos authentication is a single sign-on or SSO network authentication protocol that allows an authorized user to access a server using a ticketing scheme. Kerberos contains three components such as the user, also known as the client; the resource, which is the file server the client wants to access; and a third party called the key distribution center that has the authentication server.
- When you’re done registering your server, click on Save.
- Next, you can log in and create your tables and schemas in your database.
PostgreSQL is easy to set up, operate, and scale your workloads. PostgreSQL gives you a managed database service for business applications that require customization of your entire underlying application workloads and operating system. With PostgreSQL services, you can install agents, patches, or custom workloads on the host with granular control capabilities.