THE SQL Myth

Sometimes people make an assumption and then state it as a fact. Those that hear it, pass it on, and before you know it the mistaken assumption takes on a life of its own. This is what has happened with the computer language named SQL (pronounced ess-que-ell).

Where did SQL come from?

How did it become the query language of choice for relational databases?

Why are so many people confused about its name and how to pronounce it?

The relational database model was invented by Edgar F. “Ted” Codd, and published in a paper in 1970. Codd was working for IBM at the time, but IBM did not initially commercialize the idea. A group at IBM created a language to work with an internal relational effort. The language, SEQUEL, was an acronym for Structured English QUEry Language. The Structured English part related to the fact that statements in the language were very similar to ordinary English language statements, but were more highly structured. The QUEry Language part related to the fact that one of the things you could do with the language was to ask questions of a database. The language also could be used to create and destroy databases, as well as provide security.

IBM was finally forced into releasing a relation product when rival Relational Software, now Oracle, released one first. By that time, the name SEQUEL had already been trademarked by another company, for an unrelated product. IBM decided to drop the vowels in the name and go with SQL. This caused a couple of points of confusion.

  1. The people at IBM who were accustomed to calling the language SEQUEL continued to do so, causing that pronunciation to proliferate throughout the community, even though the name had been officially changed to SQL, pronounced ess-que-ell.
  2. People who had not been part of the original project assumed that SQL was an acronym, and that the letters SQL stood for Structured Query Language. This was an easy mistake to make, since that is what the letters had stood for in the original, unofficial acronym for the language.

The problem with calling SQL an acronym for Structured Query Language is that SQL is not a structured language, as computer scientists understand the term “structured language.” In the early days of computers, all computer languages were unstructured. By that I mean that it was possible for execution to jump from one place in the code to any other place. In many of those languages this was accomplished with a GOTO command. The result was often code that was almost impossible to understand or debug, derisively called “spaghetti code.”

In the 1970s, to overcome this problem, a discipline called structured programming came into use. It forbids the indiscriminate jumping from one place in a program to another. The GOTO command was banished in structured languages and programs became much more reliable.

SEQUEL, which was invented at about the same time that structured languages were first talked about, was not a structured language. It included GOTO functionality. SQL, when it was officially released to the world, retained that functionality and still does. It is thus not a structured language. However, as its original name implies, its syntax is a kind of structured English.

So what is to be done? People who have been pronouncing SQL as “sequel” for their entire careers will probably continue to do so. However, the difference between a structured and an unstructured language is important. We don’t want to return to the bad old days of spaghetti code. It’s important that we call a thing what it is and that we don’t call a thing what it isn’t. SQL is not a structured language, query or otherwise, and should not be called one.

For an easy introduction to SQL, try my short course here: http://pioneer-academy1.teachable.com

Creating a Database that Meets User Needs 3: The Requirements Phase

Computer Storage

After the scope of a database development project has been established during the Definition Phase, you need to find out what the stakeholders in the project need it to do for them. That is the job of the Requirements Phase. In the Definition Phase, you talk with the client. This is the person who has the authority to hire you or, if you are already an employee, assign you to this development task. This person is not, however, the only one with an interest in the project. In all probability, someone other than the client will use the system on a daily basis. Even more people may depend on the results generated by the system. It is important to find out what those people need and what they prefer because your primary client may not have a complete understanding of what would serve them best.

The amount of work you must do in the Requirements Phase depends on the client. If can be quick and easy if your are dealing with a client who has prior experience with similar database development projects. Such a client has a clear idea of what he wants and, equally important, what is feasible within the time and budget constraints that apply.

On the other hand, this phase can be difficult and drawn-out if the client has no experience with this kind of development, only a vague idea of what he wants, and an even vaguer idea of what can reasonably be done within the allotted time and budget.

Aside from your primary client, other stakeholders in the project, such as various users, managers, executives, and board members, also have ideas of what they need. These ideas often conflict with each other. Your job at that point is to come up with a set of requirements that everyone can agree upon. This will probably not meet everyone’s desires completely. It will be a compromise between conflicting desires, but will be the solution that gives the most important functions to the people who need them.

The Statement of Requirements

The Statement of Requirements is an explicit statement of the database application’s deliverables, including its display, update, and control mechanisms. It will answer such questions as:

  • What will the display look like? How will components be arranged? What will be the color scheme?
  • What items will need to be updated, and how will that be done?
  • How will users navigate between screens?
  • Will selections be made by key depressions, and if so, which  keys will do what?
  • Will operations be initiated by mouse clicks? If so, which operations?
  • What will the maximum acceptable response time to a query be?

The Statement of Requirements must be as detailed as possible because it is essentially a contract between you and your client. You are agreeing on exactly what will be delivered and when it will be delivered. To seal the arrangement, bot you and your client should sign the Statement of Requirements, signifying agreement on what you will be responsible for delivering.

Sumarizing, in the Requirements Phase you must:

  • Interview typical members of all classes of stakeholders in the project.
  • Provide leadership in getting stakeholders to agree on what is needed.
  • Create the Statement of Requirements, which describes in detail what the system will look like and what it will do.
  • Obtain client approval of the Statement of Requirements, indicated by a signature and date.

How a DBMS Organizes Database Data for Speedy Retrieval

Computer Storage

It’s important that database data be retrievable in a reasonable amount of time. This is particularly an issue for very large databases. The speed with which information is retrieved from a database depends on a number of factors. The size of the database and the speed of the hardware it is running on are obvious factors. Perhaps most critical, however, is the method used to access table data, which depends on the way the data is structured on the storage medium.

How quickly a system can retrieve desired information depends on the speed of the device that stores it. Different storage devices have a wide range of speeds, spanning many orders of magnitude. For fast retrievals, the information you want should reside on the fastest devices. Because it is difficult to predict which data items will be needed next, you can’t always make sure the data you are going to want next will be contained in the fastest storage device. Some storage allocation algorithms are nonetheless quite effective at making such predictions.

There is a hierarchy of storage types, ranging from the fastest to the slowest. In general, the faster a storage device is, the smaller its capacity. As a consequence, it is generally not possible to hold a large database entirely in the fastest available storage. The next best thing is to store that subset of the database that is most likely to be needed soon in the faster memory. If this is done properly, the overall performance of the system will be almost as fast as if the entire memory was as fast as the fastest component of it. Here are the components of a typical memory system, starting with the fastest part:

Registers: The registers in a computer system are the fastest form of storage. They are integrated into the processor chip, which means they are implemented with the fastest technology and the delay for transfers between the processing unit and the registers is minimal. It is not feasible to store any portion of a database in the registers, which are limited in number and in size. Instead, registers hold the operands that the processor is currently working on.

L1 cache: Level 1 cache is typically also located in the processor chip, but is not as intimately integrated with the processor as are the registers. It is the fastest form of storage that can store a significant fraction of a database.

L2 cache: Level 2 cache is generally located on a separate chip from the processor. It has greater capacity and is usually somewhat slower than the L1 cache.

Hard disk: Hard disk storage has orders of magnitude more capacity than does cache and is orders of magnitude slower. This is where databases are stored. Registers, L1 cache, and L2 cache are all volatile forms of memory. The data is lost when power is removed. Hard disk storage, on the other hand, is non-volatile. The data is retained even when the system is turned off. Because hard disk systems can hold a large database and retain it when power is off or interrupted, such systems are the normal home of all databases.

Offline storage: It is not necessary to have immediate access to databases that are not in active use. They can be retained on storage media that are slower than hard drives. A sequential storage medium such as magnetic tape is fine for such use. Data access is exceedingly slow, but acceptable for data that is rarely if ever needed. Huge quantities of data can be stored on tape. Tape is the ideal home for archives of obsolete data that nevertheless need to be retained against the day when they might be called upon again.

 

Business users who just want to retrieve the information they need from a database need not concern themselves with how the DBMS does it. However, it is good to know what is going on “under the covers” to return to you the results you want in a timely fashion.

To receive my free mini-booklet on protecting yourself from SQL injection attacks, or to comment on this article, please fill in the form below. Thanks!

SQL Cheat Sheet

To receive my free SQL Cheat Sheet, please enter your name and email address in the Subscribe fields to the left.

Thanks,

Allen Taylor

 

Looking for a High Paying Job? There are a Lot of Openings Right Now.

Computer Storage

 

I’ve spent my career developing, using, and writing about databases. Databases, and the data they contain are the most valuable assets that organizations large and small possess. Making use of that data, curating it, and maintaining it are high value jobs in those organization. There  are a lot of different database management systems out there, some proprietary and others open source. They all have one thing in common, however. They rely on the SQL data language to talk to the database and retrieve useful information from it. Right now, thousands of job openings that require SQL knowledge are posted in the Web every day. People who can fill those openings are much in demand, which means salaries, already high, are on the way up.

Back in 1995, when I wrote the first edition of SQL For Dummies, recognition of the importance of SQL was just starting to emerge from the inner sanctum of corporate data centers. Now SQL is universal and contained in anything that stores data in a structured manner. I am working on a video course that will provide an easy onramp to SQL proficiency. I will keep you updated on what will be in the course and when I plan to roll it out. In the meantime, if you have specific areas that you will like to see covered in the course, please contact me at allen.taylor@ieee.org and let me know.

Web Sites Susceptible to Hacks

One of my subscribers, an experienced web wrangler, had one of his WordPress sites reduced to guacamole by malicious hackers. This, it turns out, is not that uncommon. There are vulnerabilities that are well known to the hacker community, in thousands of plug-ins that are used with WordPress. Here’s an article that exposes the extent of the problem.

http://www.securityweek.com/over-8800-wordpress-plugins-have-flaws-study?_scpsug=crawled_97466_c3dbda40-c2ea-11e6-e02f-90b11c40440d#_scpsug=crawled_97466_c3dbda40-c2ea-11e6-e02f-90b11c40440d