Anyone building a web app with a significant amount of client-side input will understand the need for validating information sent from user requests. Data validation is a process through which user input is checked to see if it matches a specific format or lies within certain whitelisted criteria.
A quick recap of why input validation is important goes as follows:
Why not just validate on the frontend and leave it at that? Users can turn off JavaScript at any time, or skip the browser entirely and send requests straight to your API. Since JavaScript is the only real method of client-side validation (aside from the HTML 'required' attribute, which does little beyond catching empty fields), not having a fallback on your server can be quite consequential.
It's also possible to validate all input manually inside your route handlers, but that approach doesn't scale well and opens you up to all manner of bugs. Celebrate is an Express middleware that runs on top of joi, and it is perhaps the best library out there for running data validation tasks on a Node.js server.
Our project is going to rely on a few external libraries other than celebrate. These can be installed with the following command:
npm install express body-parser celebrate --save
//or
yarn add express body-parser celebrate
body-parser: allows us to read the contents of a POST or PUT request body
express: starts and runs a server for us
celebrate: for data validation
Let's imagine we're going to create a new app that allows users to save their favorite books for reading later. The records are meant to be entirely public. This way, anyone can log onto the website and view other users' favorite books, too.
The first iteration of our app isn't going to have any data validation, and looks like this:
//Create an express server
const express = require('express');
const bodyParser = require('body-parser');
const app = express();
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({extended: false}));
const PORT = 1234;
// Start server on port 1234
app.listen(PORT, () => console.log(`Listening on port ${PORT}`));
We have a basic server up and running. Now, let's add a few routes that allow users to enter some forms of input.
For this project, we are going to simulate a database using simple local file storage. In practice, the data is going to live in a database. The first (and rather naive) iteration of our route will look like this:
const fs = require('fs');
app.post("/book/new", (req, res) => {
  // Load the existing records from the JSON file
  const jsonFile = fs.readFileSync("./books.json");
  const jsonObject = JSON.parse(jsonFile);
  const input = req.body.input;
  jsonObject.push(input.book);
  // Write the updated array back to disk, serialized as JSON
  fs.writeFile('./books.json', JSON.stringify(jsonObject), (err) => {
    if (err) return res.status(400).send({message: "An error occurred"})
    return res.send({message: "Added book successfully."})
  })
})
All this route does for now is parse user input and push it to the 'local database.' If anything goes wrong while saving the data to the file, the user receives a 400 error instead.
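The read-append-write cycle the route relies on can be tried on its own, outside Express. The following is a minimal sketch rather than the article's final code: `appendBook` and the temporary file name are hypothetical, and it assumes the file holds a JSON array.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read the JSON array stored at filePath, append one
// book, and write the whole array back. Returns the new number of records.
function appendBook(filePath, book) {
  const books = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  books.push(book);
  // Serialize the full array, not just the new entry.
  fs.writeFileSync(filePath, JSON.stringify(books, null, 2));
  return books.length;
}

// Run against a throwaway file so a real books.json is left untouched.
const demoFile = path.join(os.tmpdir(), `books-demo-${process.pid}.json`);
fs.writeFileSync(demoFile, '[]');
const count = appendBook(demoFile, { title: 'Slow reading' });
const stored = JSON.parse(fs.readFileSync(demoFile, 'utf8'));
fs.unlinkSync(demoFile);
console.log(count, stored);
```

Note that nothing here checks what `book` contains, which is exactly the gap the rest of the article sets out to close.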
That's great for a start, but let's imagine that people have taken a real liking to your website. What happens when a malicious user decides they are going to take advantage of your open and unsecured API to spam links to their website instead? Besides, since you forgot to add client-side validation on time, a lot of fields aren't being filled out by users, but these are going to be essential for feeding to your ML system later.
You decide a hasty way to mitigate these problems would be to manually add validation to your routes.
app.post("/book/new", (req, res)=> {
//...
const input = req.body.input;
if (!input || !input.book){
return res.status(400).send({message: "Some fields are missing."})
}
jsonObject.push(input.book); //...
})
Again, this method works well for a relatively small object, but imagine you had the following JSON object (as borrowed from open library):
{
"publishers": [
{
"name": "Litwin Books"
}
],
"identifiers": {
"google": [
"4LQU1YwhY6kC"
],
"lccn": [
"2008054742"
],
"isbn_13": [
"9780980200447"
],
"amazon": [
"098020044X"
],
"isbn_10": [
"1234567890"
],
"oclc": [
"297222669"
],
"librarything": [
"8071257"
],
"project_gutenberg": [
"14916"
],
"goodreads": [
"6383507"
]
},
"classifications": {
"dewey_decimal_class": [
"028/.9"
],
"lc_classifications": [
"Z1003 .M58 2009"
]
},
"links": [
{
"url": "http://johnmiedema.ca",
"title": "Author's Website"
}
],
"weight": "1 grams",
"title": "Slow reading",
"url": "https://openlibrary.org/books/OL22853304M/Slow_reading",
"number_of_pages": 80,
"cover": {
"small": "https://covers.openlibrary.org/b/id/5546156-S.jpg",
"large": "https://covers.openlibrary.org/b/id/5546156-L.jpg",
"medium": "https://covers.openlibrary.org/b/id/5546156-M.jpg"
},
"subjects": [
{
"url": "https://openlibrary.org/subjects/books_and_reading",
"name": "Books and reading"
},
{
"url": "https://openlibrary.org/subjects/reading",
"name": "Reading"
}
],
"publish_date": "2009",
"authors": [
{
"url": "https://openlibrary.org/authors/OL6548935A/John_Miedema",
"name": "John Miedema"
}
],
"excerpts": [
{
"comment": "test purposes",
"text": "test first page"
}
],
"publish_places": [
{
"name": "Duluth, Minn"
}
]
}
Suddenly, you have well over ten fields that need validation, and the manual data validation code quickly becomes unwieldy:
//...
function validateInput(input) {
if (!input) {
return {
success: false,
message: "Empty object"
}
}
if (!input.publishers || !input.identifiers || !input.classifications || !input.links || !input.weight || !input.title || !input.number_of_pages /* || ... */) {
//..
}
}
//...
Let's not forget that links need to be verified, lengths need to be tested and ISBNs need to be verified. All this is boilerplate code you shouldn't waste too much time on.
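To get a feel for how quickly this grows, here is a rough sketch of hand-written checks for just three of the fields. The function names and the 200-character title limit are assumptions made for illustration, not requirements taken from the Open Library data.

```javascript
// Hypothetical helper: accept only well-formed http(s) links.
function isValidUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch (err) {
    return false; // new URL() throws on malformed input
  }
}

// Hypothetical hand-rolled validator covering title, url and isbn_13 only.
function validateBookManually(book) {
  const errors = [];
  if (typeof book.title !== 'string' || book.title.length === 0 || book.title.length > 200) {
    errors.push('title must be a string of 1 to 200 characters');
  }
  if (typeof book.url !== 'string' || !isValidUrl(book.url)) {
    errors.push('url must be an http(s) link');
  }
  const isbns = (book.identifiers && book.identifiers.isbn_13) || [];
  for (const isbn of isbns) {
    const digits = String(isbn).replace(/-/g, '');
    if (!/^\d{13}$/.test(digits)) {
      errors.push(`isbn_13 "${isbn}" must contain exactly 13 digits`);
    }
  }
  return errors;
}

const good = validateBookManually({
  title: 'Slow reading',
  url: 'https://openlibrary.org/books/OL22853304M/Slow_reading',
  identifiers: { isbn_13: ['9780980200447'] },
});
const bad = validateBookManually({
  title: '',
  url: 'not a link',
  identifiers: { isbn_13: ['123'] },
});
console.log(good.length, bad);
```

That is roughly thirty lines for three fields, with the remaining fields still unchecked.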
Instead of writing all that code manually, let's build a schema using Celebrate and simplify our code.
First, we need to verify the 'publishers' array. Validating an array of objects can be quite problematic, especially if it's large, but Joi takes away most of the headaches.
//..
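// celebrate exports the Joi instance it is built on, so no separate install is needed
const { celebrate, Joi } = require('celebrate');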
const publisher = Joi.object().keys({
name: Joi.string().required()
});
const publishers = Joi.array().items(publisher);
//...
Joi doesn't come with a built-in method for validating an ISBN. However, one can be validated with a regex as follows:
const isbn = Joi.string().required().regex(/^(?=(?:\D*\d){10}(?:(?:\D*\d){3})?$)[\d-]+$/)
The (?=(?:\D*\d){10}(?:(?:\D*\d){3})?$) portion of the regex, borrowed from here, is a positive lookahead. It ensures we have exactly 10 or 13 digits in the input.
If we wanted to validate the isbn_10 or isbn_13 fields, we could use a combination of the first method and this one.
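The pattern can be checked in isolation before it goes into a schema. The sketch below runs the regex from the article, unchanged, against a few sample strings chosen for illustration.

```javascript
// The ISBN pattern from the article, unchanged.
const isbnPattern = /^(?=(?:\D*\d){10}(?:(?:\D*\d){3})?$)[\d-]+$/;

const samples = [
  '9780980200447',     // 13 digits
  '978-0-9802004-4-7', // 13 digits with hyphens
  '1234567890',        // 10 digits
  '12345',             // too short
  '098020044X',        // ISBN-10 ending in the check character X
];

// Pair each sample with whether the pattern accepts it.
const results = samples.map((value) => [value, isbnPattern.test(value)]);
console.log(results);
```

The first three samples pass and the last two fail. The final case is worth noting: because the pattern only allows digits and hyphens, an ISBN-10 whose check character is X is rejected, so the pattern would need extending if such values must be accepted.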
There are several ways to validate a date with Joi. In our case, the publication date should never be later than the current date. We can enforce this with the max function.
const date = Joi.date().max("now")
If we were an opinionated platform that didn't consider anything published before 2000 a book, we could instead use code like
const year = Joi.number().integer().min(2000);
and so on...
Validating a link with the Joi API is pretty straightforward:
const url = Joi.string().uri()
All the ways Joi can be used are outside the scope of this article. However, much of what makes the library great comes down to its extensive API. Among others, it supports the following types and rules:
string: validates strings. It's used like Joi.string()
number: Joi.number() supports several operations, including min and max, as illustrated above.
required: indicates a property is required.
any
optional
array
regex
With that information, we can finally finish building our API with validation:
// identifier, classification and links are Joi schemas built the same way as publisher
const bookSchema = {
  body: {
    title: Joi.string().required(),
    publishers: Joi.array().items(publisher),
    identifiers: Joi.array().items(identifier),
    classifications: Joi.array().items(classification),
    links: Joi.array().items(links), //...
    publish_date: Joi.date().max("now")
    //...
  }
};
app.post("/book/new", celebrate(bookSchema), (req, res) => {
//...
});
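It may help to see where a validation middleware sits in the request flow. The sketch below is a simplified stand-in, not celebrate's actual implementation: `validateBody` and `checkBook` are hypothetical names, and the check is a plain function rather than a Joi schema.

```javascript
// Hypothetical middleware factory: runs a check against req.body before the
// route handler is reached.
function validateBody(check) {
  return (req, res, next) => {
    const problem = check(req.body);
    // Passing an error to next() skips the route handler and hands control
    // to Express's error-handling middleware.
    if (problem) return next(new Error(problem));
    return next();
  };
}

// Hypothetical check: returns null when valid, or a message describing the problem.
const checkBook = (body) =>
  body && typeof body.title === 'string' && body.title !== '' ? null : '"title" is required';

// Exercise the middleware with stub objects instead of a running server.
const calls = [];
const middleware = validateBody(checkBook);
middleware({ body: { title: 'Slow reading' } }, {}, (err) => calls.push(err ? err.message : 'ok'));
middleware({ body: {} }, {}, (err) => calls.push(err ? err.message : 'ok'));
console.log(calls);
```

In an Express app this middleware would be passed between the path and the handler, in the same position as celebrate(bookSchema) in the route above, so the handler only ever runs on input that passed the check.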
And finally, we can add automated error messages to every route, if we don't want to manually deal with each of them.
//...
app.use((error, req, res, next) => {
if (error.joi) { //if joi produces an error, it's likely a client-side problem
return res.status(400).json({
error: error.joi.message
});
} //otherwise, it's probably a server-side problem.
return res.status(500).send(error)
});
Posting an invalid object will result in a message like
{
"error": "child \"publishers\" fails because [\"publishers\" is required]"
}
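The error-handling middleware can also be exercised without starting a server. This sketch assumes, as the article's handler does, that validation errors carry a `joi` property (the exact error shape may differ between celebrate versions). `makeFakeRes` is a hypothetical stub, and the generic 500 body is a deliberate change from the handler above so that internal error details are not sent to the client.

```javascript
// Same branching as the article's handler, written as a named function so it
// can be called directly.
function errorHandler(error, req, res, next) {
  if (error.joi) {
    // Validation failures are the client's problem: respond with 400.
    return res.status(400).json({ error: error.joi.message });
  }
  // Anything else is treated as a server-side failure.
  return res.status(500).json({ error: 'Internal server error' });
}

// Hypothetical stub that records what the handler sends.
function makeFakeRes() {
  const res = { statusCode: 200, body: undefined };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (payload) => { res.body = payload; return res; };
  return res;
}

const validationRes = errorHandler(
  { joi: { message: '"title" is required' } }, {}, makeFakeRes(), () => {}
);
const crashRes = errorHandler(new Error('disk full'), {}, makeFakeRes(), () => {});
console.log(validationRes.statusCode, validationRes.body);
console.log(crashRes.statusCode, crashRes.body);
```

Registering it is a matter of passing the function to app.use after all routes, since Express identifies error handlers by their four-argument signature.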
The error messages produced by Joi/Celebrate are convenient for developers, but not so much for the end-user. If you need an error message to present to the end-user, you'll need a simple tweak:
//..
const bookSchema = Joi.object().keys({
title: Joi.string()
.required()
.error(new Error('Please provide a valid title!')),
//..
});
//...
Copyright © 2022 Bradley K.