LG202: IoT Cloud Server Project - Database Design
This is a project all about relationships, relationships between users, organisations, projects, devices, device roles… A relational database is a logical place to store this data, we can create relationships between the various entities and define indexes to ensure lightning-fast retrieval of data.
When you search for papers on indexing and relationships on Google Scholar the dates of the papers, many from the ’80s and ’90s, indicate this is now a mature and well-understood technology. I have been using it for over 20 years.
Database management and optimization was once a dark art, now, SQL Azure will use AI to correct any mistakes we make to ensure optimal performance. It adjusts indexes based on the way we query the data, so constantly adapts to not only the amount of data but our use of it… A very skilled and valuable member of our team.
https://docs.microsoft.com/en-us/azure/azure-sql/database/database-advisor-implement-performance-recommendationsSecurity First, Second and Third!
My approach to database use in web applications is a security-first one, born out of fear! That fear is twofold… First, losing business-critical data which is just so embarrassing, secondly, making users personal data available to third parties, which can be as serious as putting lives at risk. Here’s a couple of scenarios.
Scenario 1
Imagine you have a website that displays order information on a page. It passes a parameter entered into the browser query string to SQL Command in the server-side code.
Let’s say the request looks like /orders/details?orderid=3 which is handled by the following code…
using (SqlConnection connnection = new SqlConnection(ConnString))
{
string sql = "SELECT * FROM tblOrders WHERE OrderID = " + Request[“orderid”];
using (SqlCommand cmd = new SqlCommand(sql))
{
cmd.Connection = connnection;
cmd.Connection.Open();
var result = cmd.Execute();
}
}
Ok, that works, it returns the requested Order from the database, but what happens if you change the string from 3 to 3; DROP TABLE tblOrders; very bad stuff… that’s what happens!
This kind of attack is known as SQL Injection, and is a well-known vulnerability so allowing its exploitation would be unforgivably lax. As a rule, we don’t want our web applications interacting with tables or creating any SQL Commands directly.
We will create a database user for the WebApp connection that can’t access anything except specific Stored Procedures which will only accept specific strongly-typed Parameters.
This is a good start, but if someone gained access to the Connection String, they can still access your data, we have some data we just can’t afford to let people see, not even us!
Scenario 2
If you have signed up to this service to access or share sensor data, there’s a 59% chance you’ve used a password you use on other services. [link] Maybe that’s how the hacker gained access to our data, maybe the developer used the same password as he’d used on another compromised site?
This is the nightmare scenario, a guy sitting in his mum’s basement now has an entire copy of our database. It has 100,000 users, 100,000 passwords linked to 100,000 email addresses. This isn’t going to end well for 59% of you!
We have two defences against this, Encryption and Hashing. Encryption for information we need to retrieve, such as email addresses. Hashing for information we just need to prove was entered correctly such as passwords. SQL Server can perform both these tasks for us, but we are going to do them in the WebApp so a data breach would require two points of failure, in fact, we’ll then store the keys required to perform these tasks elsewhere, adding a third… born out of fear!
WebSocket Database
Now we’ve done enough to ensure we aren’t the source of the next big data breach, let’s take a look at the database schema for this project.
Using our user stories and learning objectives for this module, I can start to sketch out the entity types I need to define in my database. Users, Devices, Projects are our starting point, I’m going to add Organisations to that too, just to add another level of filtering. How we think of the relationships between items defines how we join them together.
A User is responsible for a Device, and a Project is administered by an Organisation, so they can be linked directly. Users may have multiple devices and may have different roles within different projects. We will use link tables to join these together.
Just nine Tables… that’s pretty good… and none of our tables is too long, which is often a sign more normalisation is required.
General Database Thoughts
I’ve taken a couple of shortcuts that we should discuss, I don’t have time to develop a more complex database schema for this module, the final solution will require a little more work in the following areas.User Types – I’ve used UserTypeID in the link tables linkOrganisationUsers and linkProjectUsers, this integer will define the level of access the user has. I don’t have a tblUserTypes table so in my Class Library in C# I just have an enumerator with a handful of values. This works perfectly well, but if at some point I wanted to query the database directly for reporting, anyone reading it would have no idea what the various UserTypeID’s represented.
Locations – I have Location and LocationDescription in two of my tables, tblProjects and tblDevices. This is a shortcut so the dashboard pages I develop for this module can show where Projects are based and where Devices are located. More work is required here, devices move, how do we want to store location information, postcode, coordinates?
Address – In tblOrganisations I have address fields. Anything that may change, and you may want to keep the previous information available should probably be stored in a linked table. In this case, tblAddresses with AddressFrom and AddressTo fields, also an AddressTypeID we can set to CurrentAddress or PreviousAddress so we can always query the latest address without sorting through records.
This illustrates my ‘long table’ comment. If you don’t normalise your structure you can end up with CurrentAddress, PreviousAddress1, PreviousAddress2 etc, all in your main table… this gets messy real quick when you need about 5 fields to store each address.
A good rule of thumb is, anything you may want to filter the records by, ideally, should not be entered into a text field in a table, firstly because a ‘Table Scan’, the type of search required to retrieve this data, is very expensive, and secondly because you may end up with multiple variations of the same value, “WHERE tblDevices.Model = ‘iPad’” doesn’t retrieve entries where the user has input ‘iPad Pro’ etc. Your data is only as good as the data entry.
In our case, we should have a ‘tblModels’ table and a ‘ModelID’ field in both the ‘tblModels’ table and the ‘tblDevices’ table.
CRUD Stored Procedures
As I mentioned whilst worrying about security, we will create ‘Stored Procedures which will only accept specific strongly-typed Parameters’. These will enable us to Create, Read, Update and Delete records from tables… CRUD as it is known.
In SQL Server Management Studio I have an extension that enables me to click on a table and select Create CRUD. This generates these stored procedures for me, the only additional work is to grant access to our WebApp user and in some instances return related records from other tables to reduce the number of reads the WebApp has to perform.CREATE PROC [dbo].[crud_linkProjectUsers_select]
@ProjectUserID int
AS
SET NOCOUNT ON
SET XACT_ABORT ON
BEGIN TRAN
SELECT * from qryProjectUsers
WHERE ([ProjectUserID] = @ProjectUserID OR @ProjectUserID IS NULL)
COMMIT
GO
GRANT EXECUTE
ON OBJECT::[dbo].[crud_linkProjectUsers_select] TO [web_system]
AS [dbo];
Above is the Read part of the CRUD for returning a record from the linkProjectUsers table, but we don’t want to have to also retrieve the records from tblUsers and tblProjects, so we amend the Stored Procedure to return all three by calling the qryProjectUsers view instead of the tblProjectUsers table.
The contents of that view is below.
CREATE VIEW dbo.qryProjectUsers
AS
SELECT dbo.linkProjectUsers.ProjectUserID, dbo.linkProjectUsers.ProjectID, dbo.linkProjectUsers.UserID, dbo.linkProjectUsers.UserTypeID, dbo.linkProjectUsers.Active, dbo.tblUsers.UserID AS user_UserID, dbo.tblUsers.Firstname AS user_Firstname, dbo.tblUsers.Surname AS user_Surname, dbo.tblUsers.Mobile AS user_Mobile, dbo.tblUsers.Email AS user_Email,
dbo.tblUsers.EmailHash AS user_EmailHash, dbo.tblUsers.PasswordSalt AS user_PasswordSalt, dbo.tblUsers.PasswordHash AS user_PasswordHash, dbo.tblUsers.Verified AS user_Verified, dbo.tblUsers.Reset AS user_Reset, dbo.tblProjects.ProjectID AS project_ProjectID, dbo.tblProjects.OrganisationID AS project_OrganisationID, dbo.tblProjects.Project AS project_Project,
dbo.tblProjects.ProjectDescription AS project_ProjectDescription, dbo.tblProjects.ProjectImage AS project_ProjectImage, dbo.tblProjects.ProjectLink AS project_ProjectLink, dbo.tblProjects.Location AS project_Location, dbo.tblProjects.LocationDescription AS project_LocationDescription, dbo.tblProjects.Start AS project_Start, dbo.tblProjects.Live AS project_Live,
dbo.tblOrganisations.OrganisationID AS organisation_OrganisationID, dbo.tblOrganisations.Organisation AS organisation_Organisation, dbo.tblOrganisations.Description AS organisation_Description, dbo.tblOrganisations.Street AS organisation_Street, dbo.tblOrganisations.Town AS organisation_Town, dbo.tblOrganisations.County AS organisation_County,
dbo.tblOrganisations.Postcode AS organisation_Postcode, dbo.tblOrganisations.Country AS organisation_Country, dbo.tblOrganisations.Website AS organisation_Website, dbo.tblOrganisations.Logo AS organisation_Logo, dbo.tblOrganisations.Registered AS organisation_Registered
FROM dbo.linkProjectUsers INNER JOIN
dbo.tblProjects ON dbo.linkProjectUsers.ProjectID = dbo.tblProjects.ProjectID INNER JOIN
dbo.tblUsers ON dbo.linkProjectUsers.UserID = dbo.tblUsers.UserID INNER JOIN
dbo.tblOrganisations ON dbo.tblProjects.OrganisationID = dbo.tblOrganisations.OrganisationID
GO
ON OBJECT::[dbo].[crud_linkProjectUsers_select] TO [web_system]
AS [dbo];
There will be a few more Procedures to write, for example, to return all Devices linked to a specific user. Any interactions with the database that are not covered by our CRUD will need Stored Procedures, but this is a great starting point and has automated much of the development for us..
Rapid Application Development
We are finished developing the SQL Database, but we have two more stored procedures to look at before we move on to developing the Class Library, in fact, they will do the majority of that for us.
We know we need to Create, Read, Update, and maybe Delete records from the database, and we know what fields are in each table. By parsing our stored procedures two strings, the name of the Table we want to update and the name of the Class we want to create in C#, we can automate this process. We can create the code for both the Class file and a Test that runs on our WebApp to check all its methods.
If we need to make changes to the table structure, we just re-run the CRUD creation and the two Stored Procedures to update the Class files. As I said in the last module, I’m about 70% RAD!
The code for these two Stored Procedures is linked below, and we will be working on the Class files they create in the next article.
Why not use Entity Framework, it is designed for this?
A good question, historically I have found Entity Framework code-heavy, and slow, especially during testing on a local machine as it has to compile the entire library when you run the first query.
Entity Framework offers a level of abstraction I do not want, I want to know exactly what is happening in the database and control exactly what the WebApp has access to… again with the fear!