A Very Gentle Introduction to SQL
SQL (pronounced “ess cue ell,” “sequel,” or “squeal”) stands for Structured Query Language, the international standard for getting data sets out of databases.
General Syntax
In general, SQL queries take the form of:
WHERE [condition(s) you’re looking for]
For example, to get a list of the chart number, first name, and last name of all active patients who live in the 90210 zip code, we’d use the SQL query:
To find out which patient(s) have the home phone number 936-273-3859, we’d write the query:
To find a list of Medicaid babies born since 1/1/2016, we’d write the query:
(REGISTER.SICKVNO = 1) AND (REGISTER.BIRTHDAT > “1/1/2016”)
You can use Boolean operators (OR, AND, NOT) in the WHERE conditions. For example, to find a list of active patients whose PCP is doctor CLR with primary, secondary, or tertiary insurance BLU:
To see more examples of search criteria that can be used in OP Reports, see Module 4, Lesson 2, “SEARCH tab.”
It’s generally better to start with one or two conditions (WHERE clauses) and slowly add more in one at a time, rather than piling on all the WHERE conditions at first. Otherwise, it’s easy to put so many conditions on your search that one will be overly restrictive, and nothing will turn up in your results:
( SINGLEGUY.HEIGHT > 72) AND ( SINGLEGUY.INCOME > 100000) AND (SINGLEGUY.EYES = ‘BLUE’) AND (SINGLEGUY.MOM_DISTANCE > 200) AND (SINGLEGUY.HOME <> “BASEMENT”) AND (SINGLEGUY.ATHLETIC = ‘Y’) AND (SINGLEGUY.KIDS <= 3) AND (SINGLEGUY.PSYCHO_EX = ‘N’) AND (SINGLEGUY.WILLING_TO_COMMIT = ‘Y’) …
Sorting
When you get a long list, it’s often helpful to sort the rows. Adding an ORDER BY clause will sort the query results. For example, to sort our Beverly Hills query by last name then first name, we’d want…
ORDER BY REGISTER.LNAME, REGISTER.FNAME
But to sort it by patient chart number, from smallest to largest, we’d want…
ORDER BY REGISTER.PATNO
Joins
For all but the simplest reports, we will want information from more than one table – this procedure is called a join. In order to join two tables in a logical fashion, we have to find a common “unique index” – that is, a field common to both tables.
Let’s say we want to print a report that shows a patient’s chart number, first and last name, and the full name of their PCP. This will require data from both the REGISTER and the STAFF1 tables:
REGISTER
|
STAFF1
|
So we could Join these two tables to get something like:
PATNO | FNAME | LNAME | STAFFNAME |
101 |
Mary |
Smith |
Michaela Quinn |
102 |
Mickey |
Maus |
Ludwig Von Drake |
103 |
Daphne |
Duq |
Otto Scratchansniff |
104 |
Polka |
Hontas |
Ludwig Von Drake |
To do the join, we use an SQL command like:
INNER JOIN STAFF1 ON ( STAFF1.STAFFINIT = REGISTER.DOCTOR)
The equal sign in the phrase STAFF1.STAFFINIT = REGISTER.DOCTOR shows that we’re looking for exact matches between the DOCTOR field in REGISTER and the STAFFINIT field in STAFF1. To be precise, the tables STAFF1 and REGISTER are now “inner joined,” and STAFFINIT and DOCTOR are the two “join fields” that the tables are “joined on.”
A join doesn’t permanently change the structure of the tables – it’s a “read-only” operation on the table.
Types of joins: inner, outer, left, right
There are 4 basic kinds of join: Inner Join, Left Outer Join, Right Outer Join, and Full Outer Join. To illustrate them, let’s look at two pretend tables:
DEMOGRAPHICS
|
RECREATION
|
An inner join matches only rows that appear in both tables. If we “inner join” the tables DEMOGRAPHICS and RECREATION on the fields DEMOGRAPHICS.PATNO = RECREATION.PATNO, we get:
INNER JOIN
|
The phrase “inner join” means that, only when an exact match is found in both fields, the two rows are joined and appear in the resulting table. If one row is blank in the join field (or no match is found in either other table), then neither row will appear in the resulting table. So people who don’t have a favorite sport declared (like 103 and 105), or records in which a sport has been entered but no name (106), will disappear out of the results. In a Venn diagram, inner join is like “A intersect B.”
Outer joins include at least one of the complete tables, plus whatever that table intersects in the other table.
For example, left outer join means you want all the rows in the first (left) table to show up, whether or not there’s a match for it in the second (right) table.
LEFT OUTER JOIN
|
A right outer join, like a left outer join, includes all of one table and whatever matches (if any) it has in the other table. A right outer join includes everything in the second (right) table, and whatever matches in the first (left) table:
RIGHT OUTER JOIN
|
Full Outer Join, the final possibility for a table join, means you want all the rows in both tables to appear, matching where applicable.
FULL OUTER JOIN
|
In general, full outer joins may wreak havoc because they tend to be sloppy “total dumps” of two tables. Use them only when a narrower join won’t work.
Choosing join fields can be tricky
Be careful that you select the right fields to join on. If you didn’t look carefully, you might accidentally join two columns that look like they actually point to the same type of data, but don’t:
This join looks for matches between the PATNO field and the STAFFID field. This would give the following result:
PATNO | FNAME | LNAME | STAFFNAME |
104 |
Polka |
Hontas |
Michaela Quinn |
The one row that does show up is matched wrong, of course: It’s totally wrong to assume that patient #104 would have doctor #104 as her PCP.
However, in OP Reports, the Query Wizard (see the next module for TABLE) often does just that. The Query Wizard “suggests” table joins, based on what it perceives to be similar/identical fields. In this case, it would see that both PATNO and STAFFID are lists of numbers starting at 101, and would assume they’re the same data series. Sometimes the suggested joins are correct; other times, not so. Accept the suggested joins only after critical evaluation.
Joins sometimes result in one-to-one results, one-to-many results, or many-to-many results. In the latter two cases, you will have much longer reports, sometimes with multiple seemingly “duplicate” line items. You will need to either account for this in your report design, or leave the duplicates (which can be removed later using Excel’s Remove Duplicates tool.)
In a one-to-one relationship, we expect each row from the table on the left to have one, and only one, proper match to the table on the right, and vice versa. Each patient in REGISTER can only have one birth history, and each birth history recorded in MEDHIST is only ascribed to one patient in REGISTER:
REGISTER
|
MEDHIST
|
In a one-to-many (also called a “many-to-one”) relationship, a row in one of the tables will have multiple matching rows in the other. In the case of associating PCPs with patients, any single (“one”) PCP might have hundreds (“many”) of patients on his panel. However, any given patient only has one PCP at a time:
REGISTER
|
STAFF1
|
In a many-to-many relationship, multiple rows in one table will match to multiple rows in the other table. ENC_NOTE contains all the encounter note data in OP, where PATNO is the patient’s chart number, NOTE_ID is a unique progress note number, and NOTE_DATE is the date of service for the encounter note. It’s reasonable to expect that a patient may have several different entries in ENC_NOTE, one for each visit to the office, as well as many different billing entries in ARCHIVE_TRANSACTIONS.
ENC_NOTE
|
ARCHIVE_TRANSACTIONS
|
Making a join on PATNO alone will result in a “cross-multiplied’ table. That is, every PATNO = 101 row on the left table will match and join every PATNO = 101 on the right table. Since there are three such rows in the left table, and four such rows in the right table, the resulting join will have twelve rows (3 x 4):
ENC_NOTE. PATNO |
ENC_NOTE.NOTE_ID |
ENC_NOTE. NOTEDATE |
ARCHIVE_ TRANS ACTIONS. DATE1 |
ARCHIVE_ TRANS ACTIONS. CPTCODE |
ARCHIVE_ TRANS ACTIONS. CHARGE |
101 |
4385 |
04/01/2014 |
04/01/2014 |
99214 |
120.00 |
101 |
4385 |
04/01/2014 |
04/01/2014 |
96110 |
50.00 |
101 |
4385 |
04/01/2014 |
04/18/2014 |
99213 |
85.00 |
101 |
4385 |
04/01/2014 |
04/29/2014 |
99213 |
85.00 |
101 |
4496 |
04/18/2014 |
04/01/2014 |
99214 |
120.00 |
101 |
4496 |
04/18/2014 |
04/01/2014 |
96110 |
50.00 |
101 |
4496 |
04/18/2014 |
04/18/2014 |
99213 |
85.00 |
101 |
4496 |
04/18/2014 |
04/29/2014 |
99213 |
85.00 |
101 |
4690 |
04/29/2014 |
04/01/2014 |
99214 |
120.00 |
101 |
4690 |
04/29/2014 |
04/01/2014 |
96110 |
50.00 |
101 |
4690 |
04/29/2014 |
04/18/2014 |
99213 |
85.00 |
101 |
4690 |
04/29/2014 |
04/29/2014 |
99213 |
85.00 |
This contains way too much duplication to have anything useful. It’s only meaningful if we match up and join on not only the patient numbers, but also the date of service. If re-define the inner join with both
as well as
…we will eliminate some of the redundant rows, and see only the CPT codes and charges that correspond to that encounter note.
ENC_NOTE. PATNO |
ENC_NOTE. NOTE_ID |
ENC_NOTE. NOTEDATE |
ARCHIVE_ TRANSACTIONS. DATE1 |
ARCHIVE_ TRANSACTIONS. CPTCODE |
ARCHIVE_ TRANSACTIONS. CHARGE |
101 |
4385 |
04/01/2014 |
04/01/2014 |
99214 |
120.00 |
101 |
4385 |
04/01/2014 |
04/01/2014 |
96110 |
50.00 |
101 |
4496 |
04/18/2014 |
04/18/2014 |
99213 |
85.00 |
101 |
4690 |
04/29/2014 |
04/29/2014 |
99213 |
85.00 |
What may begin as a many-to-many relationship on a single join may become a one-to-many (or even a one-to-one) relationship as multiple join conditions are added.
In addition to the above-described inner join, another way to join tables is he outer join. The outer join allows one table to show all the applicable fields, whether or not they have matches in the other table.
For example, let’s say we want to look at the birth history of all patients in REGISTER. We could join the following two tables on REGISTER.PATNO = MEDHIST.PATNO:
REGISTER
|
MEDHIST
|
…but an inner join like
FROM REGISTER
INNER JOIN MEDHIST ON (MEDHIST.PATNO = REGISTER.PATNO)
ORDER BY REGISTER.PATNO
would give us the following result:
REGISTER.PATNO | REGISTER.FNAME | REGISTER.LNAME | MEDHIST.DELIVERY | MEDHIST.FEEDING | MEDHIST.B_WEIGHTKG |
101 |
Mary |
Smith |
NSVD |
BREAST |
3.53 |
104 |
Polka |
Hontas |
NSVD |
BREAST |
3.12 |
When patients in the REGISTER table don’t have a corresponding row in the MEDHIST table (maybe they’re cross- cover patients for whom we never obtained a birth history), they don’t appear in an inner join. However, an outer join like:
FROM REGISTER
LEFT OUTER JOIN MEDHIST ON (MEDHIST.PATNO = REGISTER.PATNO)
ORDER BY REGISTER.PATNO
will give us:
REGISTER.PATNO | REGISTER.FNAME | REGISTER.LNAME | MEDHIST.DELIVERY | MEDHIST.FEEDING | MEDHIST.B_WEIGHTKG |
101 |
Mary |
Smith |
NSVD |
BREAST |
3.53 |
102 |
Mickey |
Maus |
|||
103 |
Daphne |
Duq |
|||
104 |
Polka |
Hontas |
NSVD |
BREAST |
3.12 |
105 |
Dollie |
Parson |
The two most commonly used kinds of outer joins are left outer join (used above) and its complement, right outer join. “Left” and “right” refer to the table which should be included in its entirety, even if there is no match in the other table. The first table referred to in the code is by convention always the “left” table and the table referred to second is the “right” table…
In this case, REGISTER is the “left” table and ARCHIVE_TRANSACTIONS is the “right” table. Thus in this query, rows in REGISTER will be returned even if they have no matches in ARCHIVE_TRANSACTIONS.
Switching the order of the tables will give the same result if the join is a Right Outer Join:
Note that when SELECTing fields from two tables you’re joining, you select fields from both tables in the list of fields. That is,
Special SQL Calculations and Transformations
Firebird SQL (the dialect of SQL that OP uses) allows special calculations on fields you’re selectin from the database. As we’ve seen, a patient’s age on a particular date of service is the difference (in years or months) between the DOS and the patient’s DOB.
Let’s say you want to know how old patient #101 was on the dates of service we’ve seen him.
DATEDIFF(day, REGISTER.BIRTHDAT, ARCHIVE_TRANSACTIONS.DATE1) gives the difference in days between the child’s date of birth and a date of service in claims. To get age in years, we divide that value by the number of days in an average year (365.24).
ARCHIVE_TRANSACTIONS.CPTCODE NOT IN (‘1’, ‘2’) eliminates dates that payments were made on patient #101’s account. This leaves only charge dates (i.e. dates of service) in the list.
For a more complete example of DATEDIFF used in a full report, see the Example Reports tutorial.
For other kinds of calculations that can be used in OP Reports, see Module 4, Part 2, “CALCS”.