This was a very well organized and well presented talk by Selene. I'll also be attending here Advanced SQL talk later, but here's my notes from this presentation.

Databases: Collection of Tables, rows, columns, look like spread sheets, unique identifers, etc..

Relational Databases (RDBMS) - determine what the db is going to be, what the data might be, visualize columns and think about their relationship, helps to build tables.

For example, a product db might have product name, category, when they placed order, customer info, etc.... Example shows a flat data file with all info in one big table, but elements are repeated in multiple rows, duplication of data, no referential data, and data can't be queried until its inserted. A form should be able display all the product names, (a little fast here...) How to get rid of data duplication, data could be misspelled in different instances, how do you know which values are the good values, and might have multiple prices so how do you know which one is the current price.

Selene Bainum - Database Design

So group related data, unique items, and then create tables, and determine the best datatype for each type of info. Two table option, one for categories (UID, cat name) and one for products (uid, name, price, uid of cat). Why is two table method better? Easier to update one piece of data which is reference multiple times such as the products table which will be pulling from categories. Easier to query category table for names rather than query product table for distinct category.

Determine parent child relationship, and here category is parent. Also define Customer (uid, name, email), Orders (uid, order date, Cust id), and Order Details (order uid, prod uid, prod quantity, prod price paid) tables. Here you want to capture the price paid at time of purchase not current price in case that changes before the order is shipped.

Table and column considerations: Naming restrictions, datatypes, null values. Naming - aovid spaces and funky chars in table/column names, follow good variable naming practices; avoid reserved words (don't name DATE 'date', NAME 'name', etc...); might want to use compound words to avoid that... Then there are naming conventions: Be consistent, Table_Name/Column_Name, tableName/columnName, etc... Should table names be plural? Product or Products? Just be consistent. If its a reference table, try adding _ref suffix like TableName_Ref, or if a join table try concat the names like Table1_Table2..

Data types: know your data, whats the type? Restrict type. Determine sorting and formatting, i.e. a date field should be sorted by date not treated as text and sorted by text. Increase efficiency by limiting field size. Common data types include (well, its a big table here comparing db types and how each vendor has slightly different types).

Keys: primary key and foreign keys. Primary key is unique id for row, usually UID, cannot be changed. Foreign keys located in child tables and establishes relationship to parent (such as category key in product table)

Indexes are used to help efficiency and speed, takes longer to insert and update, but improves search. A speed problem in application or queries is most likely lack of indexes on tables.

Null values different from empty strings or 0's, and want to make required fields NOT NULL.

Tables needed for example, Category_Ref, Product, Customer, CustOrder.

Category_Ref: CategoryID Int not null and is primary key, CategoryName varchar(150)

Product: ProductID, ProductName, ProductPrice, CategoryID (int, nullable, is foreign key)...

Customer: CID, FNAME, LNAME, etc...

CustOrder: CustOrderID, OrderDate, CID.

CustOrder_Product: CustOrderID, ProductID, ItemQuantity, ItemPrice

(After each table is created in SQL, an ALTER TABLE is run to designate add primary key): ALTER TABLE CustOrder_Product AND PRIMARY KEY (CustOrderID)

If you use char type the field will be padded with spaces to reach the full column size, but varchar doesn't do this.

Table Relationships: Solidify parent/child relations ships, referential integrity, prevents parent record being deleted while child record exists (parent will not be deleted), prevents child from being inserted without a parent record (child will not be created). Prevents "phantom records".

Creating relationships using SQL, with SQL Enterprise usually, but scripts uses ALTER TABLE, like ALTER TABLE CustOrder ADD FOREIGH KEY (CustomerID) REFERENCES Customer.

This design allows for growth, permits changing category name, adding columns, add new tables.... all very easy to do this way.

Completed design: Each table has UID/PK, store data for only one type of entity, avoid nullable columns, avoid repeating data.... Leads to a normalized database!

Selene recommends "Erwin" (?) tool to help create entity relationship diagrams.

For not null fields, add UI checks like javascript to validate data input, and check again on the server.

Selene recommends two books by Ray Camden (CFMX Dev Handbook, Mastering CFMX) and Ben's Teach Yourself SQL. Tools include ERWin (http://ca.com/) and CASE Studio 2. She's warning a little about Visio and recommends ERWin instead.

Now for code examploes.... (better to see Selene's slides for this)...

A few notes... Add parent first, then child, but delete in opposite order.... She prefers to join in the FROM clause... Use group attribute to group on CID and an inner cfoutput to group on Order ID.... Selene had a lot more offer here but I couldn't keep up... See her site http://www.webtricks.com/ for slides (cfunited icon on site).