Granting and revoking access to a view
GRANT privilege(s)
orREVOKE privilege(s)
ON object
TO role
orFROM role
- Privileges:
SELECT
,INSERT
,UPDATE
,DELETE
, etc. - Obects: table, view, schema, etc.
- Roles: a database user or a group of database users
1 2
GRANT UPDATE ON ratings TO PUBLIC; REVOKE INSERT ON films FROM db_user;
- Privileges:
Database Roles
- Manage database access permissions
- A database role is a entity that contains information that:
- Define the role’s privileges
- Can you login?
- Can you create databases?
- Can you write to tables?
- Interact with the client authentication system
- Password
- Roles can be assigned to one or more users
- Roles are global across a database cluster installation
- Define the role’s privileges
Create a Role
- Empty Role
1
CREATE ROLE data_analyst;
- Roles with some attributes set
1 2 3 4 5
CREATE ROLE intern WITH PASSWORD 'PasswordForIntern' VALID UNTIL '2020-01-01'; CREATE ROLE admin CREATEDB; ALTER ROLE admin CREATEROLE;
- Empty Role
GRANT and REVOKE privileges from roles
1 2 3
GRANT UPDATE ON ratings TO data_analyst; REVOKE UPDATE ON ratings FROM data_analyst;
The available privileges in PostgreSQL are:
SELECT
,INSERT
,UPDATE
,DELETE
,TRUNCATE
,REFERENCES
,TRIGGER
,CREATE
,CONNECT
,TEMPORARY
,EXECUTE
, andUSAGE
Users and groups (are both roles)
- A role is an entity that can function as a user and/or a group
- User roles
- Group roles
Group Role
1
CREATE ROLE data_analyst;
User Role
1 2 3 4 5
CREATE ROLE intern WITH PASSWORD 'PasswordForIntern' VALID UNTIL '2020-01-01'; GRANT data_analyst TO alex; REVOKE data_analyst FROM alex;
Common PostgreSQL roles
| Role | Allowed Access | | pg_read_all_settings | Read all configuration variables, even those normally visible only to superusers. | | pg_read_all_stats | Read all pg_stats_* views and use various statistis related extensions, even those normally visible only to superusers. | | pg_signal_backend | Send signals to other backendds (eg: cancel query, terminate). | | More.. | More .. |
Benefits and pitfalls of roles
Benefits
- Roles live on after users are deleted
- Roles can be created before user accounts
- Save DBAs time Pitfalls
- Sometimes a role gives a specific user too much access
- You need to pay attention
Table partitioning
Why partition?
Problem: queries / updates become slower Because: e.g., indices don’t fit memory Solution: split table into smaller parts (=partitioning)
Data modeling refresher
- Conceptual Data Model
- Logical Data Model For partitioning, logical data model is the same
- Physical Data Model Partitioning is part of physical data model
Vertical Partitioning
Horizontal Partitioning
1 2 3 4 5 6 7 8 9 10 11 12
CREATE TABLE sales ( ... timestamp DATE NOT NULL ) PARTITION BY RANGE (timestamp); CREATE TABLE sales_2019_q1 PARTITION OF sales FOR VALUES FROM ('2019-01-01') TO ('2019-03-31'); ... CREATE TABLE sales_2019_q4 PARTITION OF sales FOR VALUES FROM ('2019-09-01') TO ('2019-12-31'); CREATE INDEX ON sales ('timestamp');
Pros/Cons of Horizontal Partitioning
- Pros
- Indices of heavily-used partitions fit in memory
- Move to specific medium: slower vs faster
- Used for both OLAP as OLTp
- Cons
- Partitioning existing table can be a hassle
- Some constraints can not be set
- Pros
Relation to sharding
Data Integrations
What is Data Integration?
Data Integration combines data from different sources, formats, technologiges to provide users with a translated and unified view of that data.
Choosing a Data Integration Tool
- Flexible
- Reliable
- Scalable
Picking a Database Management System (DBMS)
DBMS
- DBMS: DataBase Management System
- Create and maintain databases
- Data
- Database schema
- Database engine
- Iterface between database and end users
DBMS types
- Choice of DBMS depends on database type
- Two Types:
- SQL DMBS
- NoSQL DBMS
SQL DMBS
- Relational DataBase Management System (RDBMS)
- Based on the relational model of data
- Query Language: SQL
- Best option when:
- Data is structured and unchanging
- Data must be consistent
- Tools: Microsoft-SQL Server, PostgreSQL
NoSQL DBMS
- Less structured
- Document-centered rather than table-centered
- Data doesn’t have to fit into well-defined rows and columns
- Best option when:
- Rapid growth
- No clear schema definitions
- Large quantities of data
- Types: key-value store, document store, columnar database, graph database
NoSQL DBMS - key-value store
- Combinations of keys and values
- Key: unique identifier
- Value: anything
- Use case: managing the shopping cart for an on-line buyer
- Tools: Redis
- Combinations of keys and values
NoSQL DBMS - document store
- Similar to key-value
- Values(=documents) are structured
- Use case: content management
- Tools: mongoDB
NoSQL DBMS - columnar database
- Store data in columns
- Scalable
- Use case: big data analytics where speed is important
- Tools: Cassandra
NoSQL DMBS - graph database
- Data is interconnected and best represented as a graph
- Use case: social media data, recommendations
- Example: neo4j