Basics
Hey there! So you want to learn PostgreSQL? Great choice! It’s like the Swiss Army knife of relational databases—robust, feature-rich, and incredibly reliable. Let’s dive in and get you comfortable with the essentials you’ll use almost every day.
What is PostgreSQL?
PostgreSQL (or “Postgres” for short) is an advanced open-source relational database system that’s been around since the 1990s. Think of it as a super organized filing cabinet that:
- Stores your data in tables (like spreadsheets)
- Ensures data integrity through constraints
- Allows complex querying through SQL
- Handles multiple users and transactions simultaneously
Prerequisites & Installation
Before we jump in, you’ll need:
- Basic understanding of databases (tables, rows, columns)
- Familiarity with command line interfaces
- Some SQL knowledge helps, but we’ll cover the basics
Installation
Let’s get Postgres installed:
On Mac (using Homebrew):
brew install postgresql
brew services start postgresql
On Ubuntu/Debian:
sudo apt update
sudo apt install postgresql postgresql-contrib
sudo systemctl start postgresql
On Windows:
- Download the installer from postgresql.org
- Run the installer and follow the wizard
- Use the default port (5432)
Once installed, you can access PostgreSQL through:
- Command line using
psql
- GUI tools like pgAdmin, DBeaver, or TablePlus
Connecting to PostgreSQL
# Connect to default database
psql postgres
# Connect with specific user
psql -U yourusername
# Connect to specific database
psql -d yourdbname -U yourusername
When first installed, PostgreSQL creates a default user called “postgres”:
# Switch to postgres user (Linux)
sudo -u postgres psql
Database Creation & Management
Let’s create our first database:
-- Create a new database
CREATE DATABASE coffee_shop;
-- Connect to it
\c coffee_shop
-- List all databases
\l
Table Design & Data Types
Tables are where your data lives. Let’s create one:
CREATE TABLE products (
id SERIAL PRIMARY KEY, -- Auto-incrementing unique ID
name VARCHAR(100) NOT NULL, -- Text with max length of 100
price DECIMAL(6, 2) NOT NULL, -- Money value with 2 decimal places
description TEXT, -- Unlimited text field
in_stock BOOLEAN DEFAULT TRUE, -- Boolean with default value
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP -- Current date/time
);
Common PostgreSQL Data Types:
- Text Types: VARCHAR(n), TEXT, CHAR(n)
- Numeric Types: INTEGER, DECIMAL/NUMERIC, REAL, SERIAL
- Date/Time: DATE, TIME, TIMESTAMP, INTERVAL
- Boolean: BOOLEAN
- JSON: JSON, JSONB (binary JSON, more efficient)
- Arrays: Any data type with [] (e.g., INTEGER[], TEXT[])
Adding Data (INSERT)
Let’s populate our table:
-- Single row insert
INSERT INTO products (name, price, description)
VALUES ('Espresso', 2.50, 'Strong coffee brewed by forcing hot water through finely-ground coffee beans');
-- Multi-row insert
INSERT INTO products (name, price, description)
VALUES
('Cappuccino', 3.50, 'Espresso with steamed milk and foam'),
('Latte', 4.00, 'Espresso with lots of steamed milk and a light layer of foam'),
('Americano', 3.00, 'Espresso diluted with hot water');
Reading Data (SELECT)
Now let’s query our data:
-- Get all products
SELECT * FROM products;
-- Get specific columns
SELECT name, price FROM products;
-- Get filtered data
SELECT name, price FROM products WHERE price < 4.00;
-- Sorting data
SELECT name, price FROM products ORDER BY price DESC;
-- Limit results
SELECT name, price FROM products ORDER BY price LIMIT 2;
-- Count results
SELECT COUNT(*) FROM products;
Updating Data (UPDATE)
Need to change something?
-- Update a single record
UPDATE products
SET price = 2.75
WHERE name = 'Espresso';
-- Update multiple columns
UPDATE products
SET price = price * 1.1, description = description || ' (10% price increase applied)'
WHERE price < 3.00;
-- The || operator concatenates strings in PostgreSQL
Deleting Data (DELETE)
Removing records:
-- Delete specific records
DELETE FROM products WHERE name = 'Americano';
-- Delete all records (careful!)
DELETE FROM products;
Table Relationships
Databases really shine when linking related data. Let’s add an orders table:
-- Create customers table
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
-- Create orders table with relationships
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id), -- Foreign key
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total_amount DECIMAL(8, 2) NOT NULL
);
-- Create order items table (for many-to-many relationship)
CREATE TABLE order_items (
order_id INTEGER REFERENCES orders(id),
product_id INTEGER REFERENCES products(id),
quantity INTEGER NOT NULL,
unit_price DECIMAL(6, 2) NOT NULL,
PRIMARY KEY (order_id, product_id) -- Composite primary key
);
This creates a relationship structure:
flowchart TD A[customers] --> |has many| B[orders] B --> |contains many| C[order_items] D[products] --> |used in many| C
Let’s insert some relational data:
-- Add customers
INSERT INTO customers (name, email)
VALUES ('John Doe', 'john@example.com'), ('Jane Smith', 'jane@example.com');
-- Add an order
INSERT INTO orders (customer_id, total_amount)
VALUES (1, 7.50); -- John's order
-- Add order items
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES
(1, 1, 2, 2.75), -- 2 Espressos
(1, 2, 1, 3.50); -- 1 Cappuccino
Querying Related Data with JOINs
JOINs connect tables in queries:
-- Inner join (only matching records)
SELECT o.id AS order_id, c.name AS customer, o.order_date, o.total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.id;
-- Left join (all orders, even if no matching customer)
SELECT o.id AS order_id, c.name AS customer
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.id;
Getting order details with multiple joins:
SELECT
c.name AS customer,
o.id AS order_id,
p.name AS product,
oi.quantity,
oi.unit_price,
(oi.quantity * oi.unit_price) AS item_total
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
ORDER BY o.id, p.name;
Indexes for Performance
As tables grow, queries slow down. Indexes speed them up:
-- Create an index on a column you search frequently
CREATE INDEX idx_products_name ON products(name);
-- Create a unique index
CREATE UNIQUE INDEX idx_customers_email ON customers(email);
-- Create a multi-column index
CREATE INDEX idx_order_items_lookup ON order_items(order_id, product_id);
When to use indexes:
- Columns in WHERE clauses
- JOIN columns
- ORDER BY columns
- Unique constraints
But remember: indexes speed up reads but slow down writes!
Schema Design Visualization
Here’s our coffee shop database schema:
erDiagram CUSTOMERS ||--o{ ORDERS : places ORDERS ||--|{ ORDER_ITEMS : contains PRODUCTS ||--o{ ORDER_ITEMS : "ordered in" CUSTOMERS { int id PK varchar name varchar email UK } ORDERS { int id PK int customer_id FK timestamp order_date decimal total_amount } ORDER_ITEMS { int order_id PK,FK int product_id PK,FK int quantity decimal unit_price } PRODUCTS { int id PK varchar name decimal price text description boolean in_stock timestamp created_at }
Advanced Querying Techniques
PostgreSQL offers powerful querying tools:
Aggregation Functions
-- Get average price of products
SELECT AVG(price) AS average_price FROM products;
-- Get price statistics
SELECT
MIN(price) AS lowest,
MAX(price) AS highest,
AVG(price) AS average,
SUM(price) AS total
FROM products;
-- Group by with counts
SELECT
in_stock,
COUNT(*) AS product_count,
AVG(price) AS average_price
FROM products
GROUP BY in_stock;
-- Having clause (like WHERE but for groups)
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1; -- Only customers with multiple orders
Subqueries
-- Subquery in WHERE clause
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
-- Subquery in FROM clause
SELECT customer_id, order_count
FROM (
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
) AS order_counts
WHERE order_count > 5;
Common Table Expressions (CTEs)
These are like temporary named result sets:
-- Calculate revenue per product
WITH product_revenue AS (
SELECT
p.id,
p.name,
SUM(oi.quantity * oi.unit_price) AS revenue
FROM products p
JOIN order_items oi ON p.id = oi.product_id
GROUP BY p.id, p.name
)
SELECT
name,
revenue,
revenue / (SELECT SUM(revenue) FROM product_revenue) * 100 AS percentage
FROM product_revenue
ORDER BY revenue DESC;
Basic Database Administration
User Management
-- Create a new user
CREATE USER barista WITH PASSWORD 'espresso123';
-- Grant privileges
GRANT SELECT, INSERT ON products TO barista;
GRANT SELECT, INSERT, UPDATE ON orders, order_items TO barista;
-- Make a user a superuser (admin)
ALTER USER barista WITH SUPERUSER;
-- Remove privileges
REVOKE INSERT ON products FROM barista;
Backup & Restore
# Create a backup
pg_dump -U postgres coffee_shop > coffee_shop_backup.sql
# Restore from backup
psql -U postgres coffee_shop < coffee_shop_backup.sql
# Backup a specific table
pg_dump -U postgres -t products coffee_shop > products_backup.sql
Performance Tips
Use EXPLAIN ANALYZE to see query plans:
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 1;
Keep your database maintained:
-- Cleanup and optimize a table VACUUM ANALYZE products;
Use appropriate data types (INT for IDs, not VARCHAR)
Limit results with LIMIT when possible
Use prepared statements for repeated queries
The 15% You’ll Explore on Your Own
Here’s what we haven’t covered that you might want to explore next:
- Transactions & ACID properties - For maintaining data integrity during complex operations
- Window functions - For advanced analytical queries
- Full-text search - PostgreSQL has powerful text search capabilities
- JSON functionality - Store and query JSON data natively
- Triggers & stored procedures - For automating database actions
- Views & materialized views - For simplifying complex queries
- Partitioning - Splitting large tables for better performance
- Replication & high availability - For database redundancy
- Connection pooling - For managing many concurrent connections
- Extensions - Like PostGIS for geospatial data or TimescaleDB for time-series
Summary
You now know the 85% of PostgreSQL you’ll use daily:
- Creating and managing databases and tables
- CRUD operations (Create, Read, Update, Delete)
- Relationships between tables and JOINs
- Indexing for performance
- Basic administration
The remaining 15% are more specialized features you can explore as you need them. PostgreSQL has excellent documentation at postgresql.org for when you’re ready to dive deeper!