A Bit About Database Management Systems

Databases are logically modelled clusters of information, or data. A database management system (DBMS), on the other hand, is a computer program that interacts with a database. A DBMS allows you to control access to a database, write data, run queries, and perform any other tasks related to database management.

Although database management systems are often referred to as “databases,” the two terms are not interchangeable. A database can be any collection of data, not just one stored on a computer, while a DBMS is the software that allows you to interact with a database.

Relational

The relational data model, which organizes data in tables of rows and columns, predominates in database management tools. Today there are other data models, including NoSQL and NewSQL, but relational database management systems (RDBMSs) remain dominant for storing and managing data worldwide.

SQLite

SQLite is a self-contained, file-based, and fully open-source RDBMS known for its portability, reliability, and strong performance even in low-memory environments.

The SQLite project’s website describes it as a “serverless” database. Most relational database engines are implemented as a server process, and programs communicate with that server through some form of interprocess communication that relays their requests. With SQLite, though, any process that accesses the database reads from and writes to the database disk file directly. This simplifies SQLite’s setup process, since it eliminates any need to configure a server process. Likewise, there’s no configuration necessary for programs that will use the SQLite database: all they need is access to the database file on disk.
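As a rough illustration of this serverless model, the sketch below uses the sqlite3 module from Python’s standard library. The function name, table, and file path are hypothetical and chosen only for the example. The point is that opening the database is just opening a file, with no server to start or configure.

```python
import os
import sqlite3
import tempfile


def write_and_read(db_path):
    """Create a table in the file at db_path, insert two rows, and read them back."""
    # connect() opens (or creates) the database file directly; no server is contacted.
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
        )
        conn.executemany(
            "INSERT INTO notes (body) VALUES (?)", [("first",), ("second",)]
        )
        conn.commit()
        return conn.execute("SELECT id, body FROM notes ORDER BY id").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical location: a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "example.db")
    print(write_and_read(path))
    print(os.path.exists(path))  # the whole database is this single file
```

Any other program with read and write access to the same file could open it in the same way.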

SQLite is free and open-source software, and no special license is required to use it. However, the project does offer several extensions — each for a one-time fee — that help with compression and encryption. Additionally, the project offers various commercial support packages, each for an annual fee.

When To Use SQLite

Embedded applications: SQLite is a great choice of database for applications that need portability and don’t require future expansion. Examples include single-user local applications and mobile applications or games.
Disk access replacement: In cases where an application needs to read and write files to disk directly, it can be beneficial to use SQLite for the additional functionality and simplicity that come with using SQL.
Testing: For many applications it can be overkill to test their functionality with a DBMS that uses an additional server process. SQLite has an in-memory mode which can be used to run tests quickly without the overhead of disk I/O or a separate server, making it an ideal choice for testing.
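The in-memory mode mentioned under Testing can be sketched as follows, again assuming Python’s standard sqlite3 module. The users table and the helper names are hypothetical. Passing the special name ":memory:" to connect() creates a database that lives only in RAM and disappears when the connection is closed.

```python
import sqlite3


def make_test_db():
    """Return a connection to a fresh in-memory database with a users table."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )
    return conn


def add_user(conn, name):
    """Insert a user and return the id SQLite assigned to the new row."""
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def count_users(conn):
    """Return the number of rows in the users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Each test can call make_test_db() to get an isolated, empty database, so tests do not interfere with one another.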

When Not To Use SQLite

Working with lots of data: SQLite can technically support a database up to 140TB in size, as long as the disk drive and filesystem also support the database’s size requirements. However, the SQLite website recommends that any database approaching 1TB be housed on a centralized client-server database, as an SQLite database of that size or larger would be difficult to manage.
High write volumes: SQLite allows only one write operation to take place at any given time, which significantly limits its throughput. If your application requires lots of write operations or multiple concurrent writers, SQLite may not be adequate for your needs.
Network access is required: Because SQLite is a serverless database, it doesn’t provide direct network access to its data. Any such access has to be built into the application, so if the database file is located on a separate machine from the application, the SQLite engine needs a high-bandwidth link to that disk across the network. This is an expensive, inefficient solution, and in such cases a client-server DBMS may be a better choice.
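The single-writer limit described under High write volumes can be observed directly. The following is a minimal sketch, assuming Python’s standard sqlite3 module and SQLite’s default rollback-journal mode. The function name and the events table are hypothetical. Two connections open the same file, and the second is refused a write transaction while the first holds one.

```python
import os
import sqlite3
import tempfile


def demo_single_writer(db_path):
    """Return (second_writer_blocked, row_count_after_both_writes)."""
    # timeout=0 makes a blocked connection fail immediately instead of waiting.
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS events (msg TEXT)")
        first.execute("BEGIN IMMEDIATE")  # first connection takes the write lock
        first.execute("INSERT INTO events VALUES ('from first')")
        try:
            second.execute("BEGIN IMMEDIATE")  # a second writer at the same time
        except sqlite3.OperationalError:
            blocked = True  # SQLite reported that the database is locked
        else:
            second.execute("ROLLBACK")
            blocked = False
        first.execute("COMMIT")  # releasing the lock lets other writers proceed
        second.execute("INSERT INTO events VALUES ('from second')")
        count = second.execute("SELECT COUNT(*) FROM events").fetchone()[0]
        return blocked, count
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "writers.db")  # hypothetical location
    print(demo_single_writer(path))
```

In practice applications usually set a nonzero timeout so that a blocked writer waits and retries, but writes are still applied one at a time.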

MySQL

According to the DB-Engines Ranking, MySQL has been the most popular open-source RDBMS since the site began tracking database popularity in 2012. It is a feature-rich product that powers many of the world’s largest websites and applications, including Twitter, Facebook, Netflix, and Spotify. Getting started with MySQL is relatively straightforward, thanks in large part to its exhaustive documentation and large community of developers, as well as the abundance of MySQL-related resources online.

MySQL was designed for speed and reliability, at the expense of full adherence to standard SQL. The MySQL developers continually work towards closer adherence to standard SQL, but it still lags behind other SQL implementations. It does, however, come with various SQL modes and extensions that bring it closer to compliance. Unlike applications using SQLite, applications using a MySQL database access it through a separate daemon process. Because the server process stands between the database and other applications, it allows for greater control over who has access to the database.

When To Use MySQL

Distributed operations: MySQL’s replication support makes it a great choice for distributed database setups like primary-secondary or primary-primary architectures.
Websites and web applications: MySQL powers many websites and applications across the internet. This is, in large part, thanks to how easy it is to install and set up a MySQL database, as well as its overall speed and scalability in the long run.
Expected future growth: MySQL’s replication support can help facilitate horizontal scaling. Additionally, it’s a relatively straightforward process to upgrade to a commercial MySQL product, like MySQL Cluster, which supports automatic sharding, another horizontal scaling process.

When Not To Use MySQL

SQL compliance is necessary: Since MySQL does not try to implement the full SQL standard, this tool is not completely SQL compliant. If complete or even near-complete SQL compliance is a must for your use case, you may want to use a more fully compliant DBMS.
Concurrency and large data volumes: Although MySQL generally performs well with read-heavy operations, concurrent read-writes can be problematic. If your application will have many users writing data to it at once, another RDBMS like PostgreSQL might be a better choice of database.

PostgreSQL

PostgreSQL, also known as Postgres, bills itself as “the most advanced open-source relational database in the world.” It was created with the goal of being highly extensible and standards compliant. PostgreSQL is an object-relational database, meaning that although it’s primarily a relational database it also includes features — like table inheritance and function overloading — that are more often associated with object databases.

Postgres is capable of efficiently handling multiple tasks at the same time, a characteristic known as concurrency. It achieves this without read locks thanks to its implementation of Multiversion Concurrency Control (MVCC), which ensures the atomicity, consistency, isolation, and durability of its transactions, also known as ACID compliance.
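To give a feel for how multiversion concurrency control lets readers proceed without read locks, here is a deliberately simplified toy model in Python. It is not PostgreSQL’s implementation, and the class and method names are hypothetical. It only illustrates the core idea: writers add new versions instead of overwriting data, and each reader sees the versions that were committed when its snapshot was taken.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMVCCStore:
    """Toy multiversion store: writers append versions, readers use snapshots."""

    # Maps each key to a list of (commit_id, value) pairs, oldest first.
    _versions: dict = field(default_factory=dict)
    _last_commit: int = 0

    def snapshot(self):
        """Return a snapshot id: the most recent commit visible right now."""
        return self._last_commit

    def write(self, key, value):
        """Commit a new version of key; older versions stay available to readers."""
        self._last_commit += 1
        self._versions.setdefault(key, []).append((self._last_commit, value))
        return self._last_commit

    def read(self, key, snapshot_id):
        """Return the newest value committed at or before snapshot_id, or None."""
        for commit_id, value in reversed(self._versions.get(key, [])):
            if commit_id <= snapshot_id:
                return value
        return None
```

A reader that took its snapshot before a later write keeps seeing the older value, so neither side has to wait for the other. A real system also has to handle uncommitted transactions, write conflicts, and cleanup of versions that no reader can see any more, all of which this sketch leaves out.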

Like MySQL, PostgreSQL supports numeric, string, and date and time data types. In addition, it supports data types for geometric shapes, network addresses, bit strings, text searches, and JSON entries, as well as several idiosyncratic data types.

When To Use PostgreSQL

Data integrity is important: PostgreSQL has been fully ACID-compliant since 2001 and implements multiversion concurrency control to ensure that data remains consistent, making it a strong choice of RDBMS when data integrity is critical.
Integration with other tools: PostgreSQL is compatible with a wide array of programming languages and platforms. This means that if you ever need to migrate your database to another operating system or integrate it with a specific tool, it will likely be easier with a PostgreSQL database than with another DBMS.
Complex operations: Postgres supports query plans that can leverage multiple CPUs in order to answer queries with greater speed. This, coupled with its strong support for multiple concurrent writers, makes it a great choice for complex operations like data warehousing and online transaction processing.

When Not To Use PostgreSQL

Speed is imperative: PostgreSQL was designed with extensibility and compatibility in mind, at the expense of speed. If your project requires the fastest read operations possible, PostgreSQL may not be the best choice of DBMS.
Simple setups: Because of its large feature set and strong adherence to standard SQL, Postgres can be overkill for simple database setups. For read-heavy operations where speed is required, MySQL is typically a more practical choice.
Complex replication: Although PostgreSQL does provide strong support for replication, it’s still a relatively new feature, and some configurations (like a primary-primary architecture) are only possible with extensions. Replication is a more mature feature on MySQL, and many users find MySQL’s replication easier to implement, particularly those who lack the requisite database and system administration experience.

NoSQL

The key difference between NoSQL and SQL databases is that a SQL database is a relational database. A relational database stores data in tables, which are organized into columns. Each column stores one datatype (integer, real number, string, date, etc.) and each row represents one record in the table. Non-relational databases do not store data in tables. Instead, there are multiple ways to store data in NoSQL databases (key-value, document-based, column-based).

Key-value Store

(Oracle NoSQL, Redis, Amazon Dynamo)
Key-value databases are the simplest of all the NoSQL databases. The basic data structure is a dictionary or map. A value can be stored as an integer, string, JSON, or an array, with a key used to reference that value. For example, you could use a customer ID as the key, referring to a value containing a string of the customer’s name. Using a JSON structure would add complexity to the value, as it could contain a dictionary of information about the customer identified by the corresponding key.
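The customer example above can be sketched with a plain Python dictionary standing in for the store. This is only an in-process illustration of the data model, not the API of Redis or any other product. The class name, key format, and customer records are hypothetical sample data.

```python
import json


class KeyValueStore:
    """Minimal in-memory key-value store: every value is looked up by its key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store value under key, replacing any previous value."""
        self._data[key] = value

    def get(self, key):
        """Return the value stored under key, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key):
        """Remove key and report whether it was present."""
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    kv = KeyValueStore()
    # Simple case: the key is a customer id, the value is the name as a string.
    kv.put("customer:42", "Ada Lovelace")
    # Richer case: the value is a JSON string holding several fields.
    kv.put("customer:43", json.dumps({"name": "Grace Hopper", "city": "Arlington"}))
    print(kv.get("customer:42"))
    print(json.loads(kv.get("customer:43"))["city"])
```

The store itself knows nothing about what is inside a value, which is why lookups by key are simple and fast while anything beyond that has to be done by the application.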

Benefits include rapid storage of data (due to the simplicity of the model) and integrated caching features that allow users to store and retrieve data as quickly as possible. Key-value databases are also well suited to unrelated data. For example, one user’s web app activity will be different from and unrelated to another user’s activity. However, key-value databases make it difficult to perform advanced queries beyond basic CRUD (create, read, update, delete) operations. Additionally, as the volume of data increases, maintaining unique keys becomes more difficult.

Document-Based Store

(MongoDB and Couchbase)
The next type of NoSQL database is the document-oriented database. Data is structured in the form of documents and collections. A document can be a PDF, Microsoft Word doc, XML or JSON file. Instead of columns and datatypes, a document contains key-value pairs. Each document does not have to have the same structure as other documents. Because of this, one can add data simply by adding more documents, without having to change the structure of the entire database.

Documents are grouped into collections, which serve a similar purpose to a relational table. Document databases provide a querying function to search collections for documents with particular attributes. Benefits include flexible data modeling (there is no need to force-fit a relational data model, as the database can handle structured, unstructured and semi-structured data) and fast write performance in exchange for strict consistency (great for Agile development and quick iteration). Document-based databases also allow for separation of collections by entity (for example, orders and customer profiles). However, they are also limited when it comes to advanced queries and typically do not support joins the way relational databases do.
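The idea of a collection of differently shaped documents that can be searched by attribute can be sketched in a few lines of Python. This is a toy in-memory model, not the API of MongoDB or Couchbase. The Collection class, its methods, and the sample documents are all hypothetical.

```python
class Collection:
    """Toy document collection: each document is a dict with its own set of keys."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        """Add a document; no schema is enforced, so documents may differ in shape."""
        self._docs.append(dict(doc))

    def find(self, **criteria):
        """Return every document whose fields match all of the given criteria."""
        return [
            doc
            for doc in self._docs
            if all(key in doc and doc[key] == value for key, value in criteria.items())
        ]


if __name__ == "__main__":
    customers = Collection()
    customers.insert({"name": "Ada", "city": "London"})
    # This document carries an extra field that the others do not have.
    customers.insert({"name": "Grace", "city": "Arlington", "tags": ["navy"]})
    customers.insert({"name": "Alan", "city": "London", "email": "alan@example.com"})
    print([doc["name"] for doc in customers.find(city="London")])
```

Adding a new field to one document does not require changing any of the others, which is the flexibility the text describes. Combining data from two collections, on the other hand, is left to the application in this model.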

Further reading