🗃️ Databases

SQL databases

Databases block viewer

This block viewer lets you flick through all the existing blocks in the Databases folder so you can choose what parts to add to your pages and what parts you might want to create, revise, or leave out.

It's literally just an alphabetical list of whatever is in this folder.

Being More Selective

Learning Objectives

Returning all the customers is fine for a small table (even the 130+ customers) but if we have a large number of rows, thousands or millions, then that approach is very clumsy.

If we know the value of the id column for the customer we can retrieve just that one row (we know there is only one row because id is the Primary Key).

Create a new endpoint to get the customer by using the id value:

app.get("/customers/:id", function (req, res) {
  // TODO - add code here
});

Here the endpoint includes :id, which identifies an extra parameter in the URL that will provide the id value, as in http://localhost:3000/customers/3. We just need a way to get the value from the URL.

Express provides a simple way to get such parameters from the request:

const custId = parseInt(req.params.id);

Note that the id in req.params.id must match the name after the colon in the endpoint "customers/:id".

Next we need to query the customers table and provide the value of the id into a WHERE clause so that we retrieve only the one row that matches:

db.query("SELECT * FROM customers WHERE id = $1", [custId])
  .then((result) => {
    console.log(result);
  })
  .catch((err) => {
    console.log(err);
  });

In the above code notice that:

  • The SELECT now uses a WHERE clause
  • The value we are wanting to look for appears as $1 - this is a placeholder for the ‘id’ value
  • The second parameter is the array [custId] that provides the id value

Let’s complete the endpoint to return the retrieved values:

app.get("/customers/:id", function (req, res) {
  const custId = parseInt(req.params.id);
  db.query("SELECT * FROM customers WHERE id = $1", [custId])
    .then((result) => {
      console.log(result.rows);
    })
    .catch((error) => {
      console.log(error);
    });
});

In db.query you can use placeholders $1, $2, … $9, $10, $11, … etc to mark the place where a parameter value should be used. The parameters to replace the placeholders are supplied in the second argument, the array of values. In this case there is only one value (but it must still be put into an array) so we have [custId] as the replacement value for the first placeholder, $1. If there is more than one placeholder there must be the same number of array elements and they must be in the order of the placeholder numbers.

String Placeholders

With String placeholders you don’t put apostrophes around the placeholder:

app.get("/customers/by_city/:city", (req, res) => {
  const cityName = req.params.city;
  db.query("SELECT * FROM customers WHERE city LIKE $1 || '%'", [cityName])
    .then(() => {})
    .catch((err) => {});
});

Use Placeholders to Avoid SQL Injection

What is SQL Injection

If an end user can enter unrestricted text into a data field on a form there is the possibility that they could put a nasty SQL command inside the data. For example, if the field is used to enter the customer name the user could type: J Doe'; delete from customers; select 'x' from customers as the name value. It looks a bit unlikely but any user with a little bit of SQL knowledge could eventually work out the format needed.

If the code used string concatenation to for the final SQL command it could be something like:

const myQuery = `UPDATE customers SET name = '${inputName}' WHERE id =  + ${ID}`;

Then myQuery would become: UPDATE customers SET name = 'J Doe'; DELETE FROM customers; SELECT 'x' FROM customers WHERE id = 123 (Note that semicolon can be used between SQL commands in this context)

By using placeholders instead of string concatenation we can prevent this kind of attack. This is VERY important in web-facing apps that use SQL.

exercise

  1. Make sure you have defined the endpoints for:
  2. Getting all customers
  3. Getting a single customer by id
  4. Using a method similar to the one used to get a customer by id, define an endpoint that can get customers by matching part of the name (e.g. /customers/by-name/:name). Remember there may be more than one matching row!
  5. Test all these endpoints with Postman and ensure the results are as expected.

Communicating with the database using SQL

Learning Objectives

👩🏽‍✈️👨🏾‍✈️Code along with video mentors


The PSQL utility

We use SQL to perform actions on the database and initially we can use a terminal-like utility to do this. The utility is named psql and is run using the command:

psql <dbname> <username>

The command prompt from psql is the database name currently connected:

my_hotel=>

In psql, you can use the command help to show the help menu. Within the command prompt, you can enter SQL statements and run them against PostgreSQL. To quit psql, enter the command \q.

Download the following file to a directory on your computer. This file creates the sample data you can use for the following sections. To do this, click the file to open it in a github formatted page, then right click the Raw button in the bar just above the code and select Save As (or Save Link As or similar) to save it:

Once you have the file downloaded to a known directory, execute the file build-hotel.sql from psql as shown below (replace /your/sql/path/ with the path to the download directory used above):

\include /your/sql/path/build-hotel.sql

Check that you have built all the required tables:

\dt

You should see a listing of your tables as follows (with your own username as owner):

            List of relations
    Schema |     Name      | Type  | Owner
    --------+---------------+-------+-------
    public | customers     | table | keith
    public | invoices      | table | keith
    public | reservations  | table | keith
    public | room_types    | table | keith
    public | rooms         | table | keith
    (5 rows)

Here is the table diagram of the hotel database:

erDiagram room_types ||..|| rooms : "IS_A" customers ||--o{ reservations : "BOOKS" rooms ||--o{ reservations : "HAS" reservations ||--|{ invoices : "HAS" room_types { string room_type PK decimal def_rate } rooms { int room_no PK decimal rate string room_type FK int no_guests } customers { int id PK string name string email string phone string address string city string postcode string country } reservations { int id PK int cust_id FK int room_no FK date checkin_date date checkout_date int no_guests date booking_date } invoices { int id PK int res_id FK decimal total date invoice_date boolean paid }

The SELECT Statement

We are first going to look at retrieving data from the database so we can examine it and later, use it in our applications.

To get data out of a table you use the SELECT statement (or command):

SELECT ... FROM ...;

For example:

SELECT name, phone, country FROM customers;

SQL commands entered in the psql command line tool are terminated with a semicolon (;). The SQL command can extend across several lines, but each keyword, name or value cannot be split over more than one line. For example:

SELECT name,
       phone,
       country
  FROM
       customers;

is the same as the previous example.

You can use SELECT * FROM ... to return all the columns of the table. For example:

SELECT * FROM rooms;

This is also a useful command to see what columns exist in a table. You can also use the \d <table_name> psql command to describe the table.

Note that the use of UPPER/lower case is only to emphasise and differentiate the SQL keywords (upper case) from the other names (lower case) e.g. column and table names. SQL keywords are not case-sensitive.

Scrolling the Results

When you issue a SELECT that returns a lot of data psql displays it one screenful at a time. To get the next screenful just press the Space key. You can also use the Up and Down arrow keys for line-by-line control and you can go back one screen with the ‘B’ key.

When you have finished reading the output use the ‘Q’ key to quit the display manager.

    Space       Next screenful
    'B'         Previous screenful
    Down Arrow  Next line
    Up Arrow    Previous line
    'Q'         Quit back to prompt

activity

  1. List the name, phone and email of all customers
  2. List all the details of rooms
  3. List the customer id, checkin date and number of guests from reservations
git

Some Useful psql Commands

The psql commands are not SQL and are specific to PostgreSQL (although most other RDBMS’s have commands to perform similar jobs). These commands let you display information, execute system commands, etc. Use \? to display a summary of all the psql commands.

Display a list of available tables in the database:

\dt

Display the definition of a table:

\d <table name>

Display help for SQL commands:

\h [command]

Display a summary of the psql (backslash) commands:

\?

Exit (quit) from psql:

\q

Note that psql commands ARE case sensitive, unlike SQL commands.

activity

  1. Display the definition of the customers table
  2. Display the help for the SELECT command (Note: we will not be covering ALL of this syntax!)
  3. Read the psql command help and find out what \dS does then try it

Displaying More Than Just Columns

You can use expressions in SQL:

SELECT room_no, rate * 0.85 FROM rooms;
+---------+-------------+
| room_no | rate * 0.85 |
+---------+-------------+
|     101 |     72.2500 |
|     102 |     72.2500 |
|     103 |     72.2500 |
  ...

Use a column alias to give the expression a meaningful name:

SELECT room_no,
       rate * 0.85 AS discounted_rate
    FROM rooms;
+---------+-----------------+
| room_no | discounted_rate |
+---------+-----------------+
|     101 |         72.2500 |
|     102 |         72.2500 |
|     103 |         72.2500 |

Here, the query uses the alias as the column heading. Aliases can also be used in other contexts - more on this later…


Expressions in SQL

As with Javascript you can use a variety of ‘operators’ to define expressions in SQL.

Arithmetic:

*   Multiply

/   Divide

+   Add

-   Subtract

%   Modulo (remainder)

(...) Parentheses (to override precedence)

String:

||  Concatenation

For example, to display the weekly rate for a room (with 10% weekly discount):

SELECT room_no, room_type, rate * 7 * 0.90 from rooms;

You can change the column heading using a column alias:

SELECT room_no, room_type, rate * 7 * 0.90 as weekly_rate from rooms;

Use string concatenation to glue character data together:

SELECT 'Customer name = ' || name FROM customers;

Choosing the Rows

You can choose which rows to display by specifying some condition that must be matched:

SELECT id, name, phone, email, country
  FROM customers
  WHERE country = 'France';

 id  |        name        |      phone       |            email            | country
-----+--------------------+------------------+-----------------------------+---------
 9   | Laurence Lebihan   | 91.24.4555       | laurence.lebihan@xmzx.net   | France
 12  | Carine Schmitt     | 40.32.2555       | carine.schmitt@dftu.net     | France
 15  | Janine Labrune     | 40.67.8555       | janine.labrune@dlsh.net     | France
 25  | Mary Saveley       | 78.32.5555       | mary.saveley@yppl.net       | France
 34  | Martine Rancé      | 20.16.1555       | martine.rancé@xeqs.net      | France
 35  | Marie Bertrand     | (1) 42.34.2555   | marie.bertrand@glut.net     | France
 49  | Frédérique Citeaux | 88.60.1555       | frédérique.citeaux@vekn.net | France
 59  | Annette Roulet     | 61.77.6555       | annette.roulet@lgha.net     | France
 62  | Daniel Da Silva    | +33 1 46 62 7555 | daniel.da.silva@hijy.net    | France
 63  | Daniel Tonini      | 30.59.8555       | daniel.tonini@mxvw.net      | France
 91  | Laurence Lebihan   | 91.24.4555       | laurence.lebihan@xmzx.net   | France
 92  | Paul Henriot       | 26.47.1555       | paul.henriot@uwua.net       | France
 106 | Dominique Perrier  | (1) 47.55.6555   | dominique.perrier@bdim.net  | France
(13 rows)

You can use comparison operators =, <, >, <=, >=, != (or <>)

Note: use only one = (equals) symbol to test for equality

When comparing numbers no punctuation is needed around the value, for example, WHERE rate > 100.

When comparing character data or dates you must enclose the values in single quotes (apostrophes), for example, WHERE name = 'Mary Saveley'.

Only the rows that match the comparison test (called a predicate) are returned by the query. The predicate can use columns not returned by the query,

Combining Tests in a Predicate

Use AND and OR to combine tests:

SELECT * FROM reservations
   WHERE room_no >= 200
     AND room_no < 300
     AND checkin_date >= '2018-01-01';

This lists reservations for rooms on the second floor (rooms 200 - 299) since the start of 2018. Note the format of the date value - this conforms to the ISO 8601 standard and should be used in preference to any other format to avoid ambiguity.

Another example - to find cheap or Premier rooms on floors 1 and 2 - we might try this to start with:

SELECT * FROM rooms
   WHERE room_type = 'PREMIER'
      OR rate < 100.00
     AND room_no < 300;

This isn’t quite right - it returns rooms on the 3rd and 4th floors. Why?

Overriding Evaluation Order

Just like any programming language, SQL has an evaluation order (precedence). For example, multiply and divide take precedence over add and subtract, so that:

SELECT rate + 20 * 0.85 from rooms;

is not the same as:

SELECT (rate + 20) * 0.85 from rooms;

We can override the normal precedence by using parentheses (...) around parts of the expression, just as in JavaScript.

With compound predicates AND takes precedence over OR, so that to make the query give the intended results we need to use:

SELECT * FROM rooms
   WHERE (room_type = 'PREMIER'
      OR rate < 100.00)
     AND room_no < 300;

More Predicate Types

The BETWEEN operator has the form a BETWEEN b AND c : checks that a is in the range b - c inclusive. For example:

SELECT ... WHERE price BETWEEN 100 AND 250 ...

Note that the AND in this case is not combining multiple predicates, it’s part of the BETWEEN operator.

The IN operator, a IN (b, c, d, ...) checks if the value of a is equal to any of b, c, d, etc… For example:

SELECT ... WHERE room_no IN (201, 202, 204, 206) ...

Both the BETWEEN and the IN operators can be inverted using:

  ... a NOT BETWEEN b AND c ...

  ... a NOT IN (b, c, d, ...)

The LIKE operator tests for a match against a wildcard string as a LIKE b where a is being tested and b is the wildcard string. The wildcard string contains text to be matched along with wildcard symbols ‘%’ and ‘_’.

  • % (percent) matches any number of any characters
  • _ (underscore) matches exactly one of any character

For example:

name LIKE 'A%' matches names starting with ‘A’

name LIKE '_a%' matches names that have ‘a’ as the 2nd character (note the initial underscore ‘_’)

name LIKE '%ow%' matches names containing the sequence ‘ow’ anywhere in the name

LIKE can be inverted using a NOT LIKE b

If you need to match for a string that includes one of the wildard characters you can use the ’escape’ character, which defaults to ‘\’ (backslash). For example:

str LIKE '% discount = 5\% %' matches any value in str that contains ‘discount = 5%’

LIKE is case sensitive in many SQL implementations so to make a case insensitive match you should either convert the tested value to either all upper or all lower case, for example:

lower(name) LIKE '%b%' matches any name that contains the letters B or b

Note: PostgreSQL also has the non-standard operator ILIKE that can perform a case-insensitive comparison - but avoid this to make code more portable.

activity

  1. Which customers are from Norway?
  2. Which rooms can accommodate more than two people?
  3. Which invoices are dated after one month ago?
  4. How would last month’s invoices change if we gave a discount of 15%
  5. List all customers whose second name starts with ‘M’ (hint: there’s a space before the second name)

Using SQL Functions

You can use the built-in functions of SQL just as you can in JavaScript, but note that they are different (this is true of most programming languages) but there are also differences between SQL implementations.

You use functions to change values, usually of columns, wherever you can use a column, for example, in the selected list of values:

SELECT name, length(name) AS namelen, upper(email)
  FROM customers;

This query also uses a column alias (namelen) to provide a meaningful column heading.

Functions are available that operate on all different datatypes.

Country names are mixed case so to make sure we always match regardless of the stored case we can use the lower function to find all customers from Manchester, UK:

SELECT * FROM customers
   WHERE lower(country) = 'uk'
     AND city = 'Manchester';

Assuming room rates include VAT at 20%, list room rates after VAT increases to 23.5% (from 20%), but round to the nearest pound:

SELECT room_no, room_type, rate AS old_rate,
       round(rate * 100/120 * 123.5/100) AS new_rate
   FROM rooms;

For further information on SQL functions see the official PostgreSQL documentation at https://www.postgresql.org/docs/12/functions.html (for version 12 - for other versions change 12 to the required version)


Date and Time in SQL

In SQL dates and times are held in an internal format but are represented externally (when entering values and displaying them) as strings;

  • Text date format: ‘YYYY-MM-DD’ e.g. ‘2018-07-21’ = 21 July 2018
  • Time format: ‘HH:mm:SS.ddd’ e.g. ‘14:32’
  • Date/Time format: ‘YYYY-MM-DD HH:mm:SS.ddd’ e.g. ‘2018-07-21 15:26:04’

You can perform arithmetic on dates and times, for example:

SELECT cust_id, room_no, checkin_date,
       checkout_date - checkin_date AS nights
   FROM reservations
   WHERE checkout_date = current_date + 1;

This query performs subtraction of one date from another (checkout_date - checkin_date) to calculate the number of nights the customer has stayed. It also performs addition (current_date + 1) to get tomorrow’s date so that it lists all reservations that will be checking out tomorrow.

Note: current_date is a postgres function that returns the current date.

Also note that there are many ways to get the same result - you may explore those for yourself.

You can also represent time intervals but the representations can be complicated and we shall not cover them here.

activity

  1. Write a query to check that all booking dates are before their checkin dates
  2. We plan to offer a discount of 10% on all Premier and Premier Plus rooms next month. How much would we gain on each room if occupancy rose by 5 nights over the month.
  3. List all reservations for this month and the number of nights booked.

Eliminating Duplicates

“Which nationalities visit our hotel?”:

SELECT country FROM customers;

But how many values do you see returned for each country? If two customers come from a particular country that country will appear twice in the output. If more than two come from the same country then… But we only need to know the different countries.

To see each country only once, use the keyword DISTINCT, as follows:

SELECT DISTINCT country FROM customers;

The keyword DISTINCT must appear immediately after the keyword SELECT. If more than one column is selected then DISTINCT applies to the combined values of those columns.


Ordering the Returned Rows

If you want to see the data in a specific order, e.g. “List all customers alphabetically by name within each country”:

SELECT id, name, phone, email, country
    FROM customers
    ORDER BY country, name;

You can can add ASC (ascending, the default) or DESC (descending) after each column name in the ORDER BY clause to control the direction of sorting.

For example:

SELECT id, name, country, city
    FROM customers
    ORDER BY country DESC, city;

This will sort the data into descending alphabetic order of country then ascending order of city name within each country. The output will look something like this:

 id  |          name           |   country    |       city
-----+-------------------------+--------------+-------------------
  28 | Kelvin Leong            | USA          | Allentown
  96 | Juri Yoshido            | USA          | Boston
 132 | Valarie Franco          | USA          | Boston
 100 | Allen Nelson            | USA          | Brickhaven
  46 | Miguel Barajas          | USA          | Brickhaven
  43 | Leslie Taylor           | USA          | Brickhaven
  37 | Julie King              | USA          | Bridgewater
 130 | Sue Taylor              | USA          | Brisbane
 124 | Steve Thompson          | USA          | Burbank
  29 | Juri Hashimoto          | USA          | Burlingame
  36 | Jerry Tseng             | USA          | Cambridge
  70 | Marta Hernandez         | USA          | Cambridge
 112 | Dan Lewis               | USA          | Glendale
  52 | Mary Young              | USA          | Glendale
  13 | Jean King               | USA          | Las Vegas
  89 | Brian Chandler          | USA          | Los Angeles
  97 | Dorothy Young           | USA          | Nashua
  83 | William Brown           | USA          | Newark
 120 | Violeta Benitez         | USA          | New Bedford
  79 | Wing Huang              | USA          | New Bedford
 116 | Leslie Murphy           | USA          | New Haven
       . . .

Note: you can order by columns that are not returned by the query.

Limiting the Number of Rows

You can reduce the number of rows returned by using the LIMIT clause at the end of the query:

SELECT id, name, phone, email, country
  FROM customers
  ORDER BY country, name
  LIMIT 20;

The LIMIT clause is not normally used without the ORDER BY clause - without the ORDER BY clause rows can be returned in any arbitrary sequence.

Not all SQL implementations of SQL support LIMIT, some use TOP while Oracle uses ROWNUM.

activity

  1. List the different room types and rates for all rooms avoiding duplicates.
  2. List customers’ names addresses and phone numbers in alphabetic order of names.
  3. List customers’ names, addresses, city and country in ascending order of country then reverse order of city within country.
  4. List the room number, type and the cost of staying 5 nights in each of the top 15 most expensive rooms.

Summary

In this lesson you have learned the use of databases and how relational databases are structured. You’ve also learned how to use basic single-table query commands in SQL and some of the special ‘backslash’ commands in psql. You have used the SELECT command to control the columns and values that are returned, the DISTINCT, ORDER BY and LIMIT clauses to control the order and numbers of rows returned and you’ve used the WHERE clause to choose the rows that you access. You have learned the INSERT command to add new data to the database

Next time we shall go on to more complex query constructs including joins, updates and deletes along with incorporating SQL into a node.js server.

Communicating with the database using SQL

Learning Objectives

Use the tables and data from your The Docs_hotel database. If you need to start from a clean state for your database, run psql -d The Docs_hotel -f build_hotel.sql.

Earlier, we created a new NodeJS project called The Docs-hotels-api with a single API endpoint /customers to get the list of all customers. Now, we will add other endpoints with more functionalities to interact with the The Docs_hotel database.

You should have a server.js file that looks something like this:

const express = require("express");
const app = express();
const { Pool } = require("pg");

const db = new Pool({
  user: "keith",
  host: "localhost",
  database: "The Docs_hotel",
  password: "",
  port: 5432,
});

app.get("/customers", function (req, res) {
  db.query("SELECT * FROM customers")
    .then((result) => {
      res.status(200).json({ customers: result.rows });
    })
    .catch((err) => {
      console.log(err);
    });
});

app.listen(3000, function () {
  console.log("Server is listening on port 3000. Ready to accept requests!");
});

Before we move on, run the server.js file in node and check the endpoint using your browser.

Using Postman to Test an API

You should have Postman already installed on your system. If not then please install it now.

Run Postman (it can be slow to start).

postman get cust all

Check the method is GET, set the URL for your customers endpoint then click Send.

postman get cust all results

Creating a table

Learning Objectives

Use the CREATE TABLE command, which in the simplest case has the general form:

CREATE TABLE <tablename> <column definition>, <column definition>, ...);

To create an inventory table for our hotel we might need:

CREATE TABLE inventory (
  id            SERIAL PRIMARY KEY,
  description   VARCHAR(30) NOT NULL,
  cost          NUMERIC(6,2)
);

Note: you may never need to do this. Database design is a task that requires specialist skills and considerable experience.

Naming Tables and Columns

In the CREATE TABLE command you must give the name of the table (e.g. inventory) and the names of each of the columns (in the parenthesised column definitions) (e.g. id, description, cost).

Names of tables and columns (and any other objects in the database) must start with a letter, can contain letters, digits and the underscore symbol (_) up to 64 bytes (in PostgreSQL). Names are not case-sensitive so that NAME, name and NaMe are all the same.

Data Types of Columns

In the above example:

ColumnData TypeOther
idSERIALPRIMARY KEY
descriptionVARCHAR(30)NOT NULL
costNUMERIC(6,2)

The id column uses SERIAL as its data type, making it an autoincrementing integer that increments by 1, starting from 1, for each time a new row is inserted into the table. For this to work, the id column must be omitted from the INSERT command. id is also designated as the PRIMARY KEY of the table (note that SERIAL doesn’t make the column the primary key). PRIMARY KEY also implies that the column cannot be set to NULL.

The description column is a variable length character value (VARCHAR) that can hold up to a maximum of 30 characters. The NOT NULL constraint means the value cannot be left empty, each row must have a description.

The cost column is NUMERIC(6,2), a number that can accurately store up to 6 digits, two of which are the fractional part. For example, it can hold 1234.56, -23.45 or 0.01. Note that the NUMERIC data type stores and computes values using decimal values and does not lose accuracy in the same was as, say, floating point values. NUMERIC values take longer in calculations because they don’t use simple binary values - user either integer or floating point for speed with compute-heavy numbers.

NEVER use floating point for financial values.

Other Common Data Types

There are several more standard data types (plus a few non-standard ones), including:

TypeNotes
INTEGERbinary integer with 32 bits (range approx -2 x 109 – +2 x 109)
DATEdates with no time component
TIMESTAMPdate and time (accurate to milliseconds)
BOOLEANTRUE, FALSE or NULL
TEXTvariable length text with no length limit (up to max allowed for the RDBMS - about 1Gb in PostgreSQL)

You can read more about data types in the PostgreSQL documentation. Refer to https://www.postgresql.org/docs/12/datatype.html for more information.

Changing a Table Definition

Using the ALTER TABLE command to add and remove columns:

ALTER TABLE inventory ADD COLUMN room_no INTEGER;

ALTER TABLE customers DROP COLUMN phone;

There are some constraints on adding and removing columns, for example, you cannot add a NOT NULL column to a table that already contains some rows.


Exercise 7

  1. Create a table for charge points. This must record the hotel shops, bars, cafes and restaurants that a customer can use during their stay.
  2. Include an auto-incrementing primary key
  3. Include the charge point name, a description and maximum customer credit value
  4. Insert charge points for ‘Shop’, ‘Pool Bar’, ‘Elysium Restaurant’ and ‘Room Service’ with credit limits of £1000 for each.
  5. Create a table for charge items that records amounts charged to rooms by customers using our shop, bars, restaurants, etc. This must include the room number of the room charged, the charge point used, the amount, the date and time of the charge and any customer comments.

CRUD operations with NodeJS and PostgreSQL

Learning Objectives

👩🏽‍✈️👨🏾‍✈️Code along with video mentors


💾 CRUD operations

table-diagram
Create, Read, Update, Delete: CRUD 🧶 🧶 CRUD “The acronym CRUD refers to all of the major functions that are implemented in relational database applications. Each letter in the acronym can map to a standard Structured Query Language (SQL) statement and Hypertext Transfer Protocol (HTTP) method […].”_ - Wikipedia

🌱 Creating data

In the following, we will add a new API endpoint to create a new customer in the table customers of the The Docs_hotel database. As a reminder, here is an example of a SQL insert statement to add a new customer:

INSERT INTO customers (name, email, phone, address, city, postcode, country)
  VALUES ('Fred Bloggs', 'fred@bloggs.org', '07123456789', '1 Low Brow',
          'Ashbottom', 'XY2 3ZA', 'UK');

Using body-parser for Form Data

When an endpoint can reasonably expect a large number of data values, for example, when we need to insert a new row into the customers table, then using parameters in the URL is not practical. Instead we can use a ‘middleware’ package such as body-parser, which can extract data from the body of the request.

As we create a new record in the database, we will add a new POST endpoint in the The Docs-hotels-api project from last class. Moreover, we need to be able to pass data to this API endpoint such as the name, address, postcode, etcetera, so we can use this API to create different customers. The data can be sent in the body of the request. To access the parameters in the body of the request with Express.JS, we need to add the module body-parser to the The Docs-hotels-api project:

At the terminal command line:

npm install --save body-parser

Then include it in the server.js:

const bodyParser = require("body-parser");
app.use(bodyParser.json());

📩 Inserting Rows Using Node.js

We can finally make an endpoint to create a new customer:

app.post("/customers", function (req, res) {
  const newName = req.body.name;
  const newEmail = req.body.email;
  const newPhone = req.body.phone;
  ...
  const newCountry = req.body.country;

  const query =
    `INSERT INTO customers (name, email, phone, address, city, postcode, country)
    VALUES ($1, $2, $3, $4, $5, $6, $7)`;

  db.query(query, [newName, newEmail, ..., newCountry])
    .then(() => {
      res.status(201).send("Created a new customer");
    })
    .catch(err => {
      console.log(err);
    })
});

📮 Using Postman to Test a POST

In Postman:

  • Change the method to POST
  • Ensure the URL address is your customers endpoint, for example: http://localhost:3000/customers
  • Under the address in the Postman window select Body then select raw
  • Ensure that JSON is selected at the right hand end
  • Enter the data as JSON in the panel below these settings, for example:
{ "name": "Fred Bloggs", "email": "fred@bloggs.org", "phone": "07123456789" }
  • Click send

You should see the response “Customer created.”

exercise

  1. Install body-parser and enable it in your server.js
  2. Define a new endpoint to create new customer records For simplicity, only insert the name, phone & email values. Invent your own values.
  3. After successfully inserting the row send back a message, “New customer added.”
  4. Test your endpoint using Postman
  5. Check the data has appeared in the table

What Can We Do After an Insert?

  • We must start checking for errors in the execution of the SQL. This applies to all commands, not just INSERT.
  • If the table has an auto-incrementing primary key we can obtain the value for the new row and return it to the browser. This is often required for subsequent actions in the use case.
  • The new data can be logged to the console (although this is not common practice).
  • … and so forth …

🧪 Checking for Errors in SQL

The catch block in db.query always has an error parameter. If the passed SQL query throws an error then the callback function passed to the catch method is invoked with an error object generated by node postgres. A typical situation might be:

  db
    .query(...);
    .then(/* do things for success*/);
    .catch((err) => res.status(500).json({error: err}));

👩🏽‍🔬 Validating

What could go wrong with our code so far? There is no validation of any user inputs which could result in errors, duplications or inconsistent data in the database! Here are few examples of things we can verify before creating the customer in the database:

  1. Check that the phone number contains only digits, ‘+’, ‘-’, ‘(’ or ‘)’
  2. Check that no customer with the same email already exists in the database

Let’s start by validating that phone number the consists of only digits, +, -, (, ) or space and if it doesn’t, return an error. This check uses a regular expression to replace all valid characters with a ‘0’ (zero) symbol then check the result is made up of only ‘0’ symbols.

if (
  newCustPhone.replace(/[+\-()0-9 ]/g, "0") != // replace all valid chars with 0
  "0".padEnd(newCustPhone.length, "0")
) {
  // compare with all zeros same length as phone
  return res
    .status(400)
    .send("The phone number must only contain 0-9, +, -, (, ) or space.");
}

Note: regular expressions may be new to you but they are very powerful and also quite complicated. We don’t have time to teach you the intricacies of “regex” here but you can find numerous resources on the web and in the MDN documentation.

As you have already learned, this validation can also be performed in the browser because it doesn’t involve any database interaction. This is a faster and less costly (in terms of network traffic) approach and should be used where practical. It makes user correction of mistakes quicker and more natural and can be done on a per item basis.

exercise

  1. Modify your endpoint to check for SQL errors. If an error is encountered then console.log it and return it to the browser.
  2. Add phone number validation to your endpoint to check that only digits, +, -, (, ) or space are present. Any other characters should send a suitable message back to the browser and abort the insert.
    (You may use copy & paste to get the regular expression - it’s not easy to get right by hand.)
  3. Test your endpoint with a variety of data including valid and invalid phone numbers.

📨 Check the Customer’s Email

We can also validate the new customer’s email doesn’t already exist in the database, to prevent duplicate data.

app.post("/customers", function (req, res) {
  const newName = req.body.name;
  const newEmail = req.body.email;
  const newPhone = req.body.phone;
  // et cetera ...

  //
  // Validate the new customer's email address by querying the existing customers table
  // to return any rows that contain the same values
  //
db.query("SELECT 1 FROM customers WHERE email=$1", [newEmail])
  .then((result) => {
    if (result.rowCount > 0) {      // note the use of result.rowCount
      return Promise.reject({error: 'Customer already exists'})
    } else {
      return db.query(`INSERT INTO customers (name, email, phone, address, city, postcode, country)
        VALUES ($1, $2, $3, $4, $5, $6, $7)`, [newName, newEmail, ..., newCountry],
    }
  })
  .then(() => {
    res.status(201).send("Customer created")
  })
  .catch((error) => {
    // any errors or manual rejections will be intercepted here
    // handle error here
  })

Note:

  • The SELECT does not choose any column values, just the literal 1. We are only interested in the number of rows found (it must be 0 for success).
  • If a row is found, then a customer with this email exists so we call Promise.reject: this will then trigger the first catch block in the promise chain where we can handle the error.
  • If a row is not found, then we perform the insertion query, taking care to return the call to db.query so we can access its resolve value in the next .then block
  • This validation is NOT suitable for performing in the browser because it interacts with the database.

exercise

  1. Before adding a new customer ensure there are none with the same email address in the customers table
  2. If a duplicate email is found send an appropriate message to the browser
  3. Use Postman to check that your new code works as expected by trying to insert duplicate emails as well as correct ones
  4. Check the data has been added to the database (use psql)

🗝️ Return the Primary Key Value for an Insert

If the table uses an auto-incrementing primary key then it is very useful to return the generated value after the INSERT. PostgreSQL provides a very simple mechanism to do this (as do most RDBMS’s) by appending the RETURNING clause to the INSERT, as below:

INSERT INTO customers (name, email, phone)
  VALUES ('Fred Bloggs', 'fred@bloggs.org', '07123456789')
  RETURNING id;

The endpoint now uses the query call shown below:

  db.query(`INSERT INTO customers (name, email, phone, address, city, postcode, country)
    VALUES ($1, $2, $3, $4, $5, $6, $7)
    RETURNING id`, [newName, newEmail, ..., newCountry])
    .then(result => {
        const newId = result.rows[0].id;
        console.log(`New Customer id = ${newId}`);
        res.status(200).json({lastId: newId});
    })
    .catch(err => {
      res.status(500).json({error: err});
    })

exercise

  1. Further extend your POST endpoint so that it returns the new customer id value to the browser.
  2. Use Postman to check that the new value is returned.

🆕 Updating data

We can now implement an endpoint to update a customer record in the database. For this, we will use a PUT endpoint.

app.put("/customers/:id", function (req, res) {
  const custId = req.params.id;
  const newEmail = req.body.email;
  const newPhone = req.body.phone;

  db.query("UPDATE customers SET email=$2, phone = $3 WHERE id=$1", [
    custId,
    newEmail,
    newPhone,
  ])
    .then(() => res.send(`Customer ${customerId} updated!`))
    .catch((err) => {
      console.error(err);
      res.status(500).json({ error: err });
    });
});

Note:

  • This endpoint uses both a URL parameter (:id) to supply the customer id and data in the request body to provide the updated column values.
  • The order of the placeholders does not have to correspond to their sequence in the SQL. In this case we’ve put the custId value first but it appears last in the SQL (in the WHERE clause).

We used URL parameters to query for a specific customer earlier and we used body data to insert a new customer.

What can go wrong in the code above? Again, there is no validation! The user could try to set an empty email or even a string which is not following the format of an email. Remember, validating data is very important to make sure you don’t end up with inconsistent data in your database!

exercise

  1. Create a new endpoint to update reservations to provide the room number allocated to the guest when they check in. Use the reservation id value to identify the row.
  2. Use Postman to check that the endpoint works correctly and the row is updated.

🗑️ Deleting data

To delete a record from the database, we will use a DELETE endpoint:

app.delete("/customers/:id", function (req, res) {
  const custId = req.params.id;

  db.query("DELETE FROM customers WHERE id=$1", [custId])
    .then(() => res.send(`Customer ${custId} deleted!`))
    .catch((err) => console.error(err));
});

However, if you try to delete a customer which already has some bookings, the previous endpoint will fail. Do you know why? You cannot delete a customer whose ID is used as a foreign key in another table (in this case, in the reservations table). Let’s delete all the customer bookings first:

app.delete("/customers/:customerId", function (req, res) {
  const customerId = req.params.customerId;

  db.query("DELETE FROM reservations WHERE cust_id=$1", [customerId])
    .then(() => {
      return db.query("DELETE FROM customers WHERE id=$1", [customerId]);
    })
    .then(() => res.send(`Customer ${customerId} deleted!`))
    .catch((e) => console.error(e));
});

exercise

  1. Add an endpoint to delete from the reservations table using the value of the id to choose the row to be removed. Only allow deletion of reservations with future checking dates otherwise return an error.
  2. Check the endpoint works correctly using Postman (use psql to verify the row has been deleted).

⚠️ Warning!

⚠️ warning

This approach we’ve used to update and delete rows is not safe in a typical multi-user environment. If more than one user at a time is able to perform updates and deletes there could be unintended data loss using this approach.

Optimistic Locking

The solution is known as “optimistic locking” but we just don’t have time in this course to cover it. So long as you are aware of the problem you can use the approach but add plenty of comments to make clear that the code is not yet safe.

Note also that optimistic locking requires changes not only to the code of your Node endpoints but also to the behaviour of the web pages that use and modify the data.

Defining Primary and Foreign Keys

Learning Objectives

Defining Primary and Foreign Keys

Defining Primary Keys

Use the following templates to define a Primary Key.

For a single-column PK use:

CREATE TABLE <table name> (
  ...
  <column name>   <data type>   PRIMARY KEY,
  ...
)

For example:

CREATE TABLE rooms (
  room_no       INTEGER   PRIMARY KEY,
  ...
);

To define a multi-column primary key you must define a constraint separate from the column definitions, as below:

CREATE TABLE <table name> (
  ...
  <pk col 1>     <data type>,
  <pk col 2>     <data type>,
  ... ,
  PRIMARY KEY (<pk col 1>, <pk col 2>),
  ...
);

For example:

CREATE TABLE invoice_items (
  inv_id        INTEGER REFERENCES invoices(id),
  item_no       INTEGER,
  ... ,
  PRIMARY KEY (inv_id, item_no),
  ...
);

There can be only one primary key in a table definition. The PRIMARY KEY definition implies NOT NULL so no column in a table’s PK can be set to NULL.

Note: a partial primary key can be a foreign key as well.

Defining Foreign Keys

To define foreign keys use either:

For a single-column foreign key:

  <column name>   <data type>   REFERENCES <table name> (<column name>);

where the <column name> in the REFERENCES clause is the column name in the referenced table, not the one being defined at this point. For example, in the reservations table:

  ...
  cust_id         INTEGER NOT NULL   REFERENCES customers (id),
  ...

For multi-column foreign keys we must again use a separate constraint definition, as shown:

CREATE TABLE customer_challenges (
  id           SERIAL PRIMARY KEY,
  inv_id       INTEGER,
  item_no      INTEGER,
  ...
  FOREIGN KEY (inv_id, item_no) REFERENCES invoice_items (inv_id, item_no),
  ...
);

Exercise 8

  1. Try to delete the customer Mary Saveley. What happens and why?
  2. Insert a new room, number 313 as room type ‘SUPER PREMIER’.
  3. Define primary and foreign keys required by the charge_items table
  4. Insert some rows into the charge_items table. You can invent the details.

Inserting, Updating and Deleting Rows

Learning Objectives

Inserting data

To add new data to a table use the INSERT command that has the following format:

INSERT INTO table_name (column_name, ...)
       VALUES (value, ...)

For example:

INSERT INTO customers (name, email, address, city, postcode, country)
  VALUES ('John Smith','j.smith@johnsmith.org',
          '11 New Road','Liverpool','L10 2AB','UK');

Note:

  1. You do not need to supply the value for the automatically generated id column, it is populated from a sequence generator object.
  2. The order of values in the VALUES (...) clause must correspond to the columns in the column name list. The first value is stored in the first named column, the second value in the second named column and so forth.

exercise

  1. Insert yourself in the customers table. Query the table to check your new data.
  2. Insert a new room type of PENTHOUSE with a default rate of 185.00.

Updating Existing Data

When you need to change values in a table, use the UPDATE command. The general construction to update a row is:

UPDATE table
  SET column1 = value1,
      column2 = value2
  WHERE condition;

Note that UPDATE usually requires a WHERE clause to specify the row or rows to be updated. As with SELECT, if you don’t specify a condition to restrict the rows, the command applies to all the rows in the table.

For example, to update the name and country of the customer with ID 3:

UPDATE customers
  SET name='Bob Marley',
      country='Jamaica'
  WHERE id=3;

exercise

  1. Update the postcode of the customer named Alice Evans to M21 8UP
  2. Update room 107 to allow up to 3 guests
  3. For the customer named Nadia Sethuraman, update her address to 2 Blue Street, her city to Glasgow and her postcode to G12 1AB in one query
  4. Update all the future bookings of customer with ID 96 to 3 nights (starting on the same check-in date) in one query

Deleting a row

The syntax to delete a row is:

DELETE FROM table WHERE condition;

For example, to delete the booking with ID 4:

DELETE FROM reservations WHERE id=4;

💡 tip

If you don’t supply a WHERE clause with DELETE or UPDATE the command will be applied to all the rows in the table which is rarely what you want.

exercise

  1. Delete the bookings of customer ID 108 that do not have a room number assigned
  2. Delete all the bookings of customer Juri Yoshido (customer id 96)
  3. Delete the customer details for Juri Yoshido

Integration with NodeJS

Learning Objectives

Introduction to node-postgres

“node-postgres is a collection of node.js modules for interfacing with your PostgreSQL database.” - https://node-postgres.com/

In the following, we will use node-postgres to…

  1. Connect to a database
  2. Send SQL queries to the database and get results

Loading data from a database with a GET endpoint

Let’s build a brand new NodeJS application with a single GET endpoint to load the list of customers that you already have in the customers table of the The Docs_hotels database.

First, create a new NodeJS application that we will call The Docs-hotels-api (enter server.js when asked for the entry point):

mkdir The Docs-hotels-api
cd The Docs-hotels-api
npm init

As before, we will use the Express library to build our API, and the node-postgres library to connect with our database:

npm install --save express
npm install --save pg

Create a server.js file, import express, initialise the server and start listening for requests:

const express = require("express");
const app = express();

app.listen(3000, function () {
  console.log("Server is listening on port 3000. Ready to accept requests!");
});

Import pg library and create a new GET endpoint to load the list of customers:

const { Pool } = require("pg");

const db = new Pool({
  user: "keith", // replace with you username
  host: "localhost",
  database: "The Docs_hotels",
  password: "",
  port: 5432,
});

app.get("/customers", function (req, res) {
  db.query("SELECT id, name, city, phone FROM customers")
    .then((result) => {
      res.json(result.rows);
    })
    .catch((error) => {
      console.log(error);
    });
});

In the code above:

  • We first import the Pool class from the pg library, which is used to connect to a database
  • We create a new connection (db) where we specify the credentials to connect to the The Docs_hotel database
  • We then create a new /customers endpoint where we use the method query() to send a SQL query to load all the customers from the table customers and return the results with result.rows. The query method returns a Promise: so we can access the returned rows using a .then block. You can write any valid SQL query that you learned in the query() method!

Note: There is no semicolon to end the SQL statement.

Start your server with node server.js and try to reach the /customers endpoint to see the list of customers currently available in your The Docs_hotels database. You can try to create/update/delete customers using psql to verify that your API always returns what is stored in your database.

Introduction to databases

Learning Objectives

A database is a structured set of data held in a computer. It provides ways to store, retrieve and organize information.

Why do we need them?

In the past few weeks, you stored and retrieved data using files. This is fine for simple data but it can quickly become an issue as your application becomes more complex and needs to store and manipulate more complicated data. For example, imagine you are asked to develop a booking system for a medium-sized hotel. You will need to store the list of rooms available for booking, and as you add more features, you will need to save users information, the reviews they post, the bookings each user makes and payment information. You can see that the data you need to handle can become very complicated, especially when you need to consider that data are not static, as they can be updated or deleted. To work more effectively with data, we can then use a database, which presents the following benefits:

  • A database defines a structure for your data and the relationships between entities
  • A database provides convenient and performant ways to safely store and retrieve data
  • A database provides a mechanism to check the validity of your data

Different types of database

There are many different kinds of database and different implementations. Sometimes, a database type is a better fit to certain use case or certain problems. The most well-known database types include relational database, key/value database, graph database and document database (also known as NoSQL). For this class, we will focus specifically on relational database as they are the most widely used and supported. You can consult DB-Engines to see a ranking of the most used database, as you can see, there are a lot of them!

Introduction to PostgreSQL

Learning Objectives

“PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.” (source: postgresql.org)

What is SQL?

  • Pronounced S-Q-L or sequel
  • Stands for Structured Query Language
  • SQL is the standard language used to communicate with relational database
  • SQL statements are used to query, create, update, delete records in a database as well as many other tasks
  • SQL statements are executed by a RDBMS.

What is an RDBMS?

  • Stands for Relational Database Management System
  • It is a program that processes SQL statements to manage a relational database
  • PostgreSQL is an RDBMS.

What characterizes a relational database?

As mentioned previously, a relational database is a specific type of database. Data is stored in tables (relations) of rows (tuples) and columns (attributes) as per the example below:

table-diagram

Joining Tables

Learning Objectives

Introduction

So far we’ve only looked at one table in any query. Many problems require data from several tables - how do we do that?

For example, if I want to phone or email customers who have not yet paid their invoices, which tables do I need to look at?

Use joins to combine data from more than one table. Joins use column values to match rows in one table to rows in another.

The join columns are usually referred to as foreign keys and primary keys.

🖼️

Join Diagram

Foreign and Primary Keys

Each table should have a Primary Key. This is one or more columns whose values, which cannot be NULL, are combined to provide a unique identifying value for each row. Natural primary keys are often difficult to find so many tables use an arbitrary integer whose value is automatically generated when the row is created. When joining tables we need to match a single row to one or more other rows, usually in another table - for example, matching a customer to her/his reservations. The single row (customer) is usually identified by its primary key value.

Foreign Keys are the columns in a table that reference corresponding columns in another table (although self-referencing foreign keys can reference the same table). For example, the res_id column in the invoices table references the id column in the reservations table (see diagram above).

The referenced column is almost always the primary key of the referenced table because a foreign key must always reference exactly one row in the referenced table (primary keys guarantee that).

Using JOIN in SQL

To join reservations and invoices in SQL:

SELECT r.cust_id, r.room_no, i.invoice_date, i.total
  FROM reservations r JOIN
       invoices i ON (r.id = i.res_id);

Notice:

  • The new keyword JOIN with ON (predicate)
  • Table aliases (r and i) used to qualify columns

The new syntax follows the following pattern:

SELECT ...
  FROM ... [JOIN ... ON (...)]...
  [WHERE ...]
  [GROUP BY ... [HAVING ...] ]
  [ORDER BY ...]

Use the JOIN to define the combined row source then you can use WHERE, DISTINCT, GROUP BY, ORDER BY, etc… as with single-table queries. For example:

    SELECT r.cust_id, r.room_no, i.invoice_date, i.total
      FROM reservations r JOIN
           invoices i ON (i.res_id = r.id)
      WHERE r.checkin_date > '2018-07-01'
        AND i.total < 500
      ORDER BY i.invoice_date DESC, r.cust_id;

There is no theoretical limit to the number of tables that can be joined in a query, although practical considerations like complexity and performance must be considered. It is quite common, though, to find up to seven or eight tables joined in a query.

Multi-table joins just extend the syntax to add more tables, as below:

SELECT c.name, c.phone, c.email, i.invoice_date, i.total
  FROM customers c JOIN
       reservations r ON (r.cust_id = c.id) JOIN
       invoices i ON (r.id = i.res_id)
  WHERE i.invoice_date < current_date - interval '1 month'
    AND i.paid = FALSE
  ORDER BY i.invoice_date DESC, c.id;

Note You have just learned about what is called the INNER JOIN, which is the most common kind of join. Indeed, you can use the keyword INNER in the JOIN syntax, as follows:

SELECT c.name, c.phone, c.email, i.invoice_date, i.total
  FROM customers c INNER JOIN
       reservations r ON (r.cust_id = c.id) INNER JOIN
       invoices i ON (r.id = i.res_id)
  WHERE i.invoice_date < current_date - interval '1 month'
    AND i.paid = FALSE
  ORDER BY i.invoice_date DESC, c.id;

The INNER keyword is not required (it’s the default) but some organisations might require it for the sake of coding standards.

There are other kinds of JOIN, specifically the OUTER JOIN and the CROSS JOIN but these are less frequently used in applications. If you want to find out about these kinds of JOIN refer to the PostgreSQL documentation.


Exercise 5

  1. Try and understand each of the queries above in your psql prompt
  2. Which customers occupied room 111 and what are their details?
  3. List the customer name, room details (room number, type and rate), nights stay and departure dates for all UK customers.
  4. List name, phone and email along with all reservations and invoices for customer Mary Saveley.

The Vexing Question of NULL

A column can be assigned a NULL value to indicate it has no value. This can happen when the data for this column is unknown at the time the row is created, for example, employee leaving date, order shipment date, etc… It can also be used when the data is optional.

Be careful with expressions - any expression that includes a NULL value results in NULL as the expression value.

Because NULL is ’no value’ it cannot be compared to anything else. For example, you will never get any results from:

SELECT * FROM customers WHERE postcode = NULL;

nor will you get any from:

SELECT * FROM customers WHERE postcode != NULL;

Instead you must use:

  ... WHERE postcode IS NULL

or

  ... WHERE postcode IS NOT NULL

This behaviour has some impacts on operations like JOIN, where NULL values won’t match. You could work around this, but see the warning below, by using:

  ... ON (a.col = b.col OR
          a.col IS NULL AND b.col IS NULL)

WARNING: However, be aware that this is not a sensible situation - join columns containing NULL should be expected to not match or should be disallowed (see Primary Keys above)

You can explicitly provide NULL as a value in INSERT and UPDATE statements, for example:

    INSERT INTO rooms (room_no, rate, room_type, no_guests)
      VALUES (213, 95.00, NULL, 2);

    UPDATE rooms SET room_type = NULL, no_guests = NULL
      WHERE room_no = 204;

In INSERT statements if you omit a column from the column list (following the table name) then that column will be given either:

  • an autogenerated value (if it has datatype SERIAL)
  • a default value if one has been specified in the CREATE TABLE command
  • NULL if neither of the above apply

Functions to Handle NULL

There are some functions that can operate on NULL values, especially the coalesce(x, y) function. This function looks at the first argument x and if it is NULL returns the value of the second argument y otherwise it returns the value of x. For example:

SELECT room_no, rate, coalesce(room_type, 'None') type
  FROM rooms
  WHERE no_guests IS NULL;

Notes:

  • The coalesce function can take more than two arguments and returns the first of these (from left to right) that is not null.
  • This feature is provided by most SQL vendors but goes by different names, e.g. ifnull(x, y) in MySQL, nvl(x, y) in Oracle, etc…

Exercise 6

  1. Which customers have not yet provided a phone number?
  2. Update room 304 such that it does not have a room_type.
  3. List customers (name and city) qand their reservations replacing the room number with ‘Not Assigned’ if it is NULL.

Mac-OS setup ⚙️

Learning Objectives

With the Apple Mac you can choose either a graphical installation or a command line method. The graphical method mostly involves dragging and dropping icons with a little bit of command line to configure the tools we need. The command line method uses only command line but is fairly simple.

Graphical Installation

Note that you must be logged in as a user with Administrator privileges on OSX for all these steps.

From the Macintosh OSX system you can install PostgreSQL from the Postgres.app package. Download this from https://postgresapp.com but make sure you choose PostgreSQL version 12 and the appropriate platform for your Mac. Once you have downloaded Postgres you can open the disk image from your Downloads folder. Drag the Postgres icon to the Applications folder.

Postgres Disk Image

Open Applications in the Finder and start Postgres. When you receive the alert “Postgres” is an application downloaded from the internet… then click on the Open button. You may be prompted for other options (e.g. auto update, …) and you can choose your own preference.

When the PostgreSQL window appears then click on the Initialise button to create your local PostgreSQL database and user - they will both have the same name as your OSX username.

Initialise

PostgreSQL initialises itself, creating the initial databases (template1, postgres and <your user name>) along with corresponding users. These should then be displayed in the PostgreSQL window, as below:

Running

Configure the Command Line Interface (CLI)

To use the PostgreSQL command line tools you need to add the directory to your path (used to find commands in the terminal). There are several ways to do this but the simplest is to create a file containing the directory path name. Open the terminal and enter the appropriate commands from the list below (note: you may not need to create the directory):

# First check the destination directory exists:
ls /etc/paths.d/
# If the directory doesn't exist then create it:
sudo mkdir /etc/paths.d
# Create the path entry for Postgres:
echo /Applications/Postgres.app/Contents/Versions/latest/bin | sudo tee /etc/paths.d/postgresapp

From the terminal you can now type the simple command:

psql

and this should produce something like the following:

psql (12.4)
Type "help" for help.

keith=#

The final line is the psql command prompt, waiting for you to enter a command. It comprises your database name (created with the same name as your username, in the example this is ‘keith’) followed by ‘=#’.

You are going to rename the database to my_hotels for the purposes of this course, so from the psql prompt (as above) first disconnect from your database by connecting to database postgres:

keith=# \connect postgres

Now, still at the psql prompt, rename the database:

keith=# alter database keith rename to my_hotels;

Clearly, you’ll use your own name as the name of the database to be renamed.

Note that to exit psql back to the terminal prompt use the command \q.

MacOS Graphical Install Complete

MacOS Command Line Installation

You need Homebrew installed on your Mac to use this approach. If you don’t yet have it then you should probably use the graphical method. If you already have Homebrew installed then the following command installs PostgreSQL:

brew install postgresql

This will install the software, set up the postgres user and a user with the same name as your Mac OS user and create some databases. It should leave the PostgreSQL server running on your Mac but if you power off the machine you’ll need to restart it. To restart the server, type:

brew services start postgresql

This starts the server and ensures that it will be started each time you power up your Mac. Note you can stop the server at any time by using:

brew services stop postgresql

Ubuntu setup ⚙️

Learning Objectives

Install

If you have Ubuntu as your operating system you can install PostgreSQL using the following commands in the terminal (note: the $ is the system prompt):

sudo apt-get install postgresql postgresql-contrib

Enter your ubuntu password when asked for it (it’s needed for the sudo command).

For this command you must confirm that you are happy to install the packages and any needed dependencies that the package manager identifies. Generally just type Y or y to proceed.

To verify your installation enter the following commands:

sudo -u postgres psql --version

Again, enter your Ubuntu user password when prompted. The command should display something like this:

psql (PostgreSQL) 12.2 (Ubuntu 12.2.4)

Create Your User and Database

Create the database user with the same name as your Ubuntu username (this makes logging in a bit easier):

sudo -u postgres createuser -P --createdb <username>

This will ask for your Ubuntu login password to run the sudo command. The other parts of the line are:

  • -u postgres tells sudo to run createuser as user postgres (set up when you installed postgres) -createuser the shell command to create a PostgeSQL user
  • -P (upper case) tells createuser to ask for the new user’s password
  • --createdb grants the user the privileges to create databases
  • <username> the new username (hint: make it the same as your o/s username)

You’ll be asked for the new database user’s password (twice) - you’ll need this later for when we connect to the database from Node.

Remember the new username and password details - you will need them later.

Creating a new database

In a terminal, create a new database named my_hotels with the following command:

createdb my_hotels

Then connect to your database with:

psql my_hotels

Note that if you chose to use your Ubuntu username for the database you don’t need it (or the password) to log in to psql. If you used a different username then you’ll need to log in with:

psql -U <username> my_hotels

and you’ll be prompted for the password.

The output from this command should look something like this:

psql (12.2 (Ubuntu 12.2.4))
Type "help" for help.

my_hotels=>

The last line (my_hotels=>) is the psql command prompt, made up of your database name plus ‘=>’.

Note that to exit psql back to the terminal prompt use the command \q.

Using Aggregate Functions

Learning Objectives

Basic Aggregate Functions

How to calculate totals, averages, etc. over multiple rows.

You frequently need to get a single piece of information that is derived from multiple rows in a table. For example, when you need to know the total of all invoices for August 2018:

SELECT sum(total)
   FROM invoices
   WHERE invoice_date BETWEEN
         '2018-08-01' AND '2018-08-31';

The aggregate functions are:

  • sum : Calculate the total of the values in a column
  • avg : Calculate the average (mean) of the values in a column
  • min : Determine the mimimum value of a column
  • max : Determine the maximum value of a column
  • count : Count the number of values (non-null) in a column

All the above are in the SQL standard, most implementations provide others. SUM and AVG can only apply to numeric data, the others can apply to any datatype.

Further examples:

“What is the average length of stay at our hotel?” :

SELECT avg(checkout_date - checkin_date)
  FROM reservations;

“What are the lowest and highest room rates we charge?” :

SELECT min(rate) AS lowest,
       max(rate) AS highest
  FROM rooms;

You can use the count(x) function to count non-null values:

SELECT count(id) AS id_ct, count(postcode) AS post_ct
  FROM customers;

   id_ct | post_ct
  -------+---------
     133 |     126
  (1 row)

Notice that these two results show different values - there are NULL values for postcode but id is mandatory for all rows.

If you just want to count the number of rows, use count(*). This is often used to find how many rows match a WHERE clause:

SELECT count(*) FROM customers WHERE country = 'Belgium';

exercise

  1. Get the numbers of rows in each of the tables: rooms, room_types, customers and reservations.
  2. How many reservations do we have for next month?
  3. How many invoices are still unpaid from over a month ago and what is the total owed?
  4. What is the maximum gap in days between a customer booking a room and the checkin date for that booking?

Grouping Rows for Aggregation

You can calculate aggregates over subsets of rows using the GROUP BY clause:

SELECT count(*) FROM rooms
  GROUP BY room_type;
  count
  -------
      14
      14
       8
      10
       2
  (5 rows)

What do you notice?

The query calculated the counts correctly but we have no idea which room type each value represents. To solve this we are allowed to include the GROUP BY expressions in the list of selected values, as below:

SELECT room_type, count(*) FROM rooms
  GROUP BY room_type;
 room_type    | count
--------------+-------
 PREMIUM      |    14
 PREMIER      |    14
 PREMIER PLUS |     8
 PREMIUM PLUS |    10
 FAMILY       |     2
(5 rows)

Notice the room_type used for GROUP BY is also included in the SELECT list of values.

We can group by multiple expressions, for example:

SELECT trunc(room_no/100) AS floor,
       to_char(checkin_date, 'YYYY-MM') AS month,
       count(*), sum(no_guests), avg(no_guests)
  FROM reservations
  GROUP BY floor, month;

Notice that the GROUP BY is using the column aliases floor and month that have been defined in the select list. This works in many, but not all, SQL implementations. (In those that don’t allow aliases you must use the full expression, for example: trunc(room_no/100) instead of floor)

You can use a WHERE clause to restrict the rows that are included in the aggregate function. For example, if we need the above query for only the 2nd and 3rd floors:

SELECT trunc(room_no/100) AS floor,
       to_char(checkin_date, 'YYYY-MM') AS month,
       count(*), sum(no_guests), avg(no_guests)
  FROM reservations
  WHERE room_no BETWEEN 200 AND 399
  GROUP BY floor, month;

Note that it is NOT usually possible to use column aliases in the where condition.

A WHERE clause is applied before any aggregation, if you need to restrict results using an aggregate function you can’t do that using the WHERE clause.

In the above, to return only results with the number of reservations greater than, say, 4 we use the HAVING clause:

SELECT trunc(room_no/100) AS floor,
       to_char(checkin_date, 'YYYY-MM') AS month,
       count(*), sum(no_guests), avg(no_guests)
   FROM reservations
   GROUP BY floor, month
   HAVING count(*) > 4;    --<< Note the HAVING keyword

The order of clauses in the SELECT statement is:

SELECT ...
   FROM ...
   [WHERE ...]
   [GROUP BY ...
   [HAVING ...] ]
   [ORDER BY ...]

The square brackets indicate optional clauses. Note that HAVING is only relevant when you have a GROUP BY and must follow it in the SELECT statement.

It can be confusing at first knowing whether to use a WHERE clause or a HAVING clause with GROUP BY.

  • Use the WHERE clause when values you want to test are available without having to use any aggregate functions (e.g. plain column values).

  • Use HAVING when the values you want to test are the results of aggregate functions (e.g. count(*), sum(amount), min(x), etc…).


exercise

  1. What is the grand total of all invoices for each month?
  2. How many guests could be accommodated at one time on each floor?
  3. Which rooms have been occupied for less than 10 nights and for how many nights have they been occupied?