Why this guide can take your testing skills to the next level
50+ best practices: Super-comprehensive and exhaustive
This is a guide for JavaScript & Node.js reliability from A-Z. It summarizes and curates for you dozens of the best blog posts, books, and tools the market has to offer
Advanced: Goes 10,000 miles beyond the basics
Hop into a journey that travels way beyond the basics into advanced topics like testing in production, mutation testing, property-based testing, and many other strategic & professional tools. Should you read every word in this guide your testing skills are likely to go way above the average
Full-stack: front, backend, CI, anything
Start by understanding the ubiquitous testing practices that are the foundation for any application tier. Then, delve into your area of choice: frontend/UI, backend, CI, or maybe all of them?
We have an official Node.js starter - Practica.js. Use it to generate a new solution skeleton with testing baked in, or just use it to learn from its tested code examples
Written By Yoni Goldberg
- A JavaScript & Node.js consultant
Testing Node.js & JavaScript From A To Z - My comprehensive online course with more than 7 hours of video
Follow me on Twitter
Translations - read in your own language
- Polish - Courtesy of Michal Biesiada
- Spanish - Courtesy of Miguel G. Sanguino
- Portuguese-BR - Courtesy of Iago Angelim Costa Cavalcante, Douglas Mariano Valero and koooge
- French - Courtesy of Mathilde El Mouktafi
- Japanese (draft) - Courtesy of Yuichi Yogo and ryo
- Traditional Chinese - Courtesy of Yubin Hsu
- Ukrainian - Courtesy of Serhii Shramko
- Persian - Courtesy of Ali Azmoodeh
- Russian - Courtesy of Alex Popov
Want to translate to your own language? please open an issue
Table of Contents
Section 0: The Golden Rule
A single piece of advice that inspires all the others (1 special bullet)
Section 1: The Test Anatomy
The foundation - structuring clean tests (12 bullets)
Section 2: Backend
Writing backend and Microservices tests efficiently (13 bullets)
Section 3: Frontend
Writing tests for web UI including component and E2E tests (11 bullets)
Section 4: Measuring Tests Effectiveness
Watching the watchman - measuring test quality (4 bullets)
Section 5: Continuous Integration
Guidelines for CI in the JS world (9 bullets)
Section 0: The Golden Rule
0. The Golden Rule: Design for lean testing
Do: Testing code is not production code - design it to be short, dead-simple, flat, and delightful to work with. One should look at a test and get the intent instantly. See, our minds are already occupied with our main job - the production code. There is no 'headspace' for additional complexity. Should we try to squeeze yet another sub-system into our poor brain, it will slow the team down, which works against the very reason we do testing. Practically, this is where many teams just abandon testing.
The tests are an opportunity for something else - a friendly assistant, a co-pilot, that delivers great value for a small investment. Science tells us that we have two brain systems: system 1 is used for effortless activities like driving a car on an empty road, and system 2 is meant for complex and conscious operations like solving a math equation. Design your tests for system 1: when looking at test code it should feel as easy as modifying an HTML document and not like solving 2 × (17 × 24).
This can be achieved by selectively cherry-picking techniques, tools, and test targets that are cost-effective and provide great ROI. Test only as much as needed and strive to keep it nimble; sometimes it's even worth dropping some tests and trading reliability for agility and simplicity.
Most of the advice below derives from this principle.
Ready to start?
Section 1: The Test Anatomy
1.1 Include 3 parts in each test name
(1) What is being tested? For example, the ProductsService.addNewProduct method
(2) Under what circumstances and scenario? For example, no price is passed to the method
(3) What is the expected result? For example, the new product is not approved
Note: Each bullet has code examples and sometimes also an image illustration. Click to expand
โ Code Examples
๐ Doing It Right Example: A test name that constitutes 3 parts
//1. unit under test
describe('Products Service', function() {
describe('Add new product', function() {
//2. scenario and 3. expectation
it('When no price is specified, then the product status is pending approval', ()=> {
const newProduct = new ProductService().add(...);
expect(newProduct.status).to.equal('pendingApproval');
});
});
});
๐ Doing It Right Example: A test name that constitutes 3 parts
© Credits & read-more
1. Roy Osherove - Naming standards for unit tests
1.2 Structure tests by the AAA pattern
1st A - Arrange: All the setup code to bring the system to the scenario the test aims to simulate. This might include instantiating the unit under test constructor, adding DB records, mocking/stubbing on objects, and any other preparation code
2nd A - Act: Execute the unit under test. Usually 1 line of code
3rd A - Assert: Ensure that the received value satisfies the expectation. Usually 1 line of code
โ Code Examples
๐ Doing It Right Example: A test structured with the AAA pattern
describe("Customer classifier", () => {
test("When customer spent more than 500$, should be classified as premium", () => {
//Arrange
const customerToClassify = { spent: 505, joined: new Date(), id: 1 };
const DBStub = sinon.stub(dataAccess, "getCustomer").reply({ id: 1, classification: "regular" });
//Act
const receivedClassification = customerClassifier.classifyCustomer(customerToClassify);
//Assert
expect(receivedClassification).toMatch("premium");
});
});
๐ Anti-Pattern Example: No separation, one bulk, harder to interpret
test("Should be classified as premium", () => {
const customerToClassify = { spent: 505, joined: new Date(), id: 1 };
const DBStub = sinon.stub(dataAccess, "getCustomer").reply({ id: 1, classification: "regular" });
const receivedClassification = customerClassifier.classifyCustomer(customerToClassify);
expect(receivedClassification).toMatch("premium");
});
1.3 Describe expectations in a product language: use BDD-style assertions
Do: Code your tests declaratively so the reader gets the intent instantly. Assert in a human-readable, BDD style with expect or should and not using custom code. If Chai & Jest don't include the desired assertion and it's highly repeatable, consider extending the Jest matcher (Jest) or writing a custom Chai plugin
โ Code Examples
Anti-Pattern Example: The reader must skim through not-so-short, imperative code just to get the test story
test("When asking for an admin, ensure only ordered admins in results", () => {
//assuming we've added here two admins "admin1", "admin2" and "user1"
const allAdmins = getUsers({ adminOnly: true });
let admin1Found = false,
admin2Found = false;
allAdmins.forEach(aSingleUser => {
if (aSingleUser === "user1") {
assert.notEqual(aSingleUser, "user1", "A user was found and not admin");
}
if (aSingleUser === "admin1") {
admin1Found = true;
}
if (aSingleUser === "admin2") {
admin2Found = true;
}
});
if (!admin1Found || !admin2Found) {
throw new Error("Not all admins were returned");
}
});
๐ Doing It Right Example: Skimming through the following declarative test is a breeze
it("When asking for an admin, ensure only ordered admins in results", () => {
//assuming we've added here two admins
const allAdmins = getUsers({ adminOnly: true });
expect(allAdmins)
.to.include.ordered.members(["admin1", "admin2"])
.but.not.include.ordered.members(["user1"]);
});
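If a needed assertion is missing and highly repeatable, extending the matcher keeps tests declarative. A minimal sketch of a custom Jest matcher - the matcher name, the range logic and getDiscountedPrice are illustrative assumptions, not part of the guide:
// jest setup file - a minimal sketch of a custom matcher (illustrative names)
expect.extend({
  toBeWithinRange(received, floor, ceiling) {
    const pass = received >= floor && received <= ceiling;
    return {
      pass,
      message: () => `expected ${received} ${pass ? "not " : ""}to be within range ${floor}..${ceiling}`,
    };
  },
});

// usage inside a test - the assertion still reads like a requirement
test("When applying a discount, the final price stays within the allowed range", () => {
  expect(getDiscountedPrice(100)).toBeWithinRange(0, 100); //getDiscountedPrice is a hypothetical unit under test
});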
1.4 Stick to black-box testing: Test only public methods
Do: Test only the public methods and the observable outcome - this is also referred to as behavioral testing. On the other side, should you test the internals (white-box approach), your focus shifts from planning the component outcome to nitty-gritty details, and your test might break because of minor code refactors although the results are fine - this dramatically increases the maintenance burden
โ Code Examples
๐ Anti-Pattern Example: A test case is testing the internals for no good reason
class ProductService {
//this method is only used internally
//Changing this name will make the tests fail
calculateVATAdd(priceWithoutVAT) {
return { finalPrice: priceWithoutVAT * 1.2 };
//Changing the result format or the key name above will make the tests fail
}
//public method
getPrice(productId) {
const desiredProduct = DB.getProduct(productId);
const finalPrice = this.calculateVATAdd(desiredProduct.price).finalPrice;
return finalPrice;
}
}
it("White-box test: When the internal methods get 0 vat, it return 0 response", async () => {
//There's no requirement to allow users to calculate the VAT, only show the final price. Nevertheless we falsely insist here to test the class internals
expect(new ProductService().calculateVATAdd(0).finalPrice).to.equal(0);
});
1.5 Choose the right test doubles: Avoid mocks in favor of stubs and spies
Do: Test doubles are a necessary evil because they are coupled to the application internals, yet some provide immense value (read here a reminder about test doubles: mocks vs stubs vs spies).
Before using test doubles, ask a very simple question: Do I use it to test functionality that appears, or could appear, in the requirements document? If not, it's a white-box testing smell.
For example, if you want to test that your app behaves reasonably when the payment service is down, you might stub the payment service and trigger some 'No Response' return to ensure that the unit under test returns the right value. This checks our application behavior/response/outcome under certain scenarios. You might also use a spy to assert that an email was sent when that service is down - this is again a behavioral check which is likely to appear in a requirements doc ('Send an email if payment couldn't be saved'). On the flip side, if you mock the Payment service and ensure that it was called with the right JavaScript types - then your test is focused on internal things that have nothing to do with the application functionality and are likely to change frequently
โ Code Examples
๐ Anti-pattern example: Mocks focus on the internals
it("When a valid product is about to be deleted, ensure data access DAL was called once, with the right product and right config", async () => {
//Assume we already added a product
const dataAccessMock = sinon.mock(DAL);
//hmmm BAD: testing the internals is actually our main goal here, not just a side-effect
dataAccessMock
.expects("deleteProduct")
.once()
.withArgs(DBConfig, theProductWeJustAdded, true, false);
new ProductService().deletePrice(theProductWeJustAdded);
dataAccessMock.verify();
});
Doing It Right Example: spies are focused on testing the requirements, but as a side-effect they unavoidably touch the internals
it("When a valid product is about to be deleted, ensure an email is sent", async () => {
//Assume we already added here a product
const spy = sinon.spy(Emailer.prototype, "sendEmail");
new ProductService().deletePrice(theProductWeJustAdded);
//hmmm OK: we deal with internals? Yes, but as a side effect of testing the requirements (sending an email)
expect(spy.calledOnce).to.be.true;
});
Want to learn all these practices with live video? Visit my online course: Testing Node.js & JavaScript From A To Z
1.6 Don't 'foo', use realistic input data
Do: Often production bugs are revealed under some very specific and surprising input - the more realistic the test input is, the greater the chances are to catch bugs early. Use dedicated libraries like Chance or Faker to generate pseudo-real data that resembles the variety and form of production data. For example, such libraries can generate realistic phone numbers, usernames, credit cards, company names, and even 'lorem ipsum' text. You may also create some tests (on top of unit tests, not as a replacement) that randomize fakers' data to stretch your unit under test, or even import real data from your production environment. Want to take it to the next level? See the next bullet (property-based testing).
Otherwise: All your development testing will falsely show green when you use synthetic inputs like 'Foo', but then production might turn red when a hacker passes in a nasty string like '@3e2ddsf . ##' 1 fdsfds . fds432 AAAA'
โ Code Examples
๐ Anti-Pattern Example: A test suite that passes due to non-realistic data
const addProduct = (name, price) => {
const productNameRegexNoSpace = /^\S*$/; //no white-space allowed
if (!productNameRegexNoSpace.test(name)) return false; //this path never reached due to dull input
//some logic here
return true;
};
test("Wrong: When adding new product with valid properties, get successful confirmation", async () => {
//The string "Foo" which is used in all tests never triggers a false result
const addProductResult = addProduct("Foo", 5);
expect(addProductResult).toBe(true);
//False sense of success: the operation succeeded only because we never
//tried a long product name that includes spaces
});
๐ Doing It Right Example: Randomizing realistic input
it("Better: When adding new valid product, get successful confirmation", async () => {
const addProductResult = addProduct(faker.commerce.productName(), faker.random.number());
//Generated random input: {'Sleek Cotton Computer', 85481}
expect(addProductResult).to.be.true;
//Test failed, the random input triggered some path we never planned for.
//We discovered a bug early!
});
1.7 Test many input combinations using Property-based testing
Do: Typically we choose a few input samples for each test. Even when the input format resembles real-world data (see bullet 'Don't foo'), we cover only a few input combinations (method('', true, 1), method('string', false, 0)). However, in production, an API that is called with 5 parameters can be invoked with thousands of different permutations, and one of them might bring our process down (see Fuzz Testing). What if you could write a single test that sends 1000 permutations of different inputs automatically and catches for which input our code fails to return the right response? Property-based testing is a technique that does exactly that: by sending all the possible input combinations to your unit under test it increases the serendipity of finding a bug. For example, given a method - addNewProduct(id, name, isDiscount) - the supporting libraries will call this method with many combinations of (number, string, boolean) like (1, 'iPhone', false), (2, 'Galaxy', true). You can run property-based testing using your favorite test runner (Mocha, Jest, etc.) using libraries like js-verify or testcheck (much better documentation). Update: Nicolas Dubien suggests in the comments below to check out fast-check which seems to offer some additional features and also to be actively maintained
โ Code Examples
Doing It Right Example: Testing many input permutations with 'fast-check'
import fc from "fast-check";
describe("Product service", () => {
describe("Adding new", () => {
//this will run 100 times with different random properties
it("Add new product with random yet valid properties, always successful", () =>
fc.assert(
fc.property(fc.integer(), fc.string(), (id, name) => {
expect(addNewProduct(id, name).status).toEqual("approved");
})
));
});
});
1.8 If needed, use only short & inline snapshots
Do: When there is a need for snapshot testing, use only short and focused snapshots (i.e. 3-7 lines) that are included as part of the test (Inline Snapshot) and not within external files. Keeping this guideline will ensure your tests remain self-explanatory and less fragile.
On the other hand, 'classic snapshots' tutorials and tools encourage storing big files (e.g. component rendering markup, API JSON result) on some external medium and ensuring, each time the test runs, that the received result is compared with the saved version. This, for example, can implicitly couple our test to 1000 lines with 3000 data values that the test writer never read and reasoned about. Why is this wrong? By doing so, there are 1000 reasons for your test to fail - it's enough for a single line to change for the snapshot to become invalid, and this is likely to happen a lot. How frequently? For every space, comment, or minor CSS/HTML change. Not only this, the test name wouldn't give a clue about the failure as it just checks that 1000 lines didn't change; it also encourages the test writer to accept as the desired truth a long document he couldn't inspect and verify. All of these are symptoms of an obscure and eager test that is not focused and aims to achieve too much
It's worth noting that there are a few cases where long & external snapshots are acceptable - when asserting on schema and not data (extracting out values and focusing on fields) or when the received document rarely changes
Otherwise: A UI test fails. The code seems right, the screen renders perfect pixels, what happened? Your snapshot testing just found a difference between the original document and the current received one - a single space character was added to the markdown...
โ Code Examples
๐ Anti-Pattern Example: Coupling our test to unseen 2000 lines of code
it("TestJavaScript.com is renderd correctly", () => {
//Arrange
//Act
const receivedPage = renderer
.create(<DisplayPage page="http://www.testjavascript.com"> Test JavaScript </DisplayPage>)
.toJSON();
//Assert
expect(receivedPage).toMatchSnapshot();
//We now implicitly maintain a 2000 lines long document
//every additional line break or comment - will break this test
});
๐ Doing It Right Example: Expectations are visible and focused
it("When visiting TestJavaScript.com home page, a menu is displayed", () => {
//Arrange
//Act
const receivedPage = renderer
.create(<DisplayPage page="http://www.testjavascript.com"> Test JavaScript </DisplayPage>)
.toJSON();
//Assert
const menu = receivedPage.content.menu;
expect(menu).toMatchInlineSnapshot(`
<ul>
<li>Home</li>
<li> About </li>
<li> Contact </li>
</ul>
`);
});
1.9 Copy code, but only what's necessary
Do: Include all the necessary details that affect the test result, but nothing more. As an example, consider a test that should factor 100 lines of input JSON - pasting this in every test is tedious. Extracting it outside to transferFactory.getJSON() will leave the test vague - without data, it's hard to correlate the test result with the cause ("why is it supposed to return 400 status?"). The classic book xUnit Test Patterns named this pattern 'the mystery guest' - something unseen affected our test results, we don't know what exactly. We can do better by extracting repeatable long parts outside AND mentioning explicitly which specific details matter to the test. Going with the example above, the test can pass parameters that highlight what is important: transferFactory.getJSON({sender: undefined}). In this example, the reader should immediately infer that the empty sender field is the reason why the test should expect a validation error or any other similar adequate outcome.
โ Code Examples
Anti-Pattern Example: The test failure is unclear because the cause is external and hidden within a huge JSON
test("When no credit, then the transfer is declined", async() => {
// Arrange
const transferRequest = testHelpers.factorMoneyTransfer(); //get back 200 lines of JSON
const transferServiceUnderTest = new TransferService();
// Act
const transferResponse = await transferServiceUnderTest.transfer(transferRequest);
// Assert
expect(transferResponse.status).toBe(409); // But why do we expect failure? All seems perfectly valid in the test
});
๐ Doing It Right Example: The test highlights what is the cause of the test result
test("When no credit, then the transfer is declined ", async() => {
// Arrange
const transferRequest = testHelpers.factorMoneyTransfer({userCredit:100, transferAmount:200}) //obviously there is lack of credit
const transferServiceUnderTest = new TransferService({disallowOvercharge:true});
// Act
const transferResponse = await transferServiceUnderTest.transfer(transferRequest);
// Assert
expect(transferResponse.status).toBe(409); // Obviously if the user has no credit it should fail
});
1.10 Don't catch errors, expect them
Do: When trying to assert that some input triggers an error, it might look right to use try-catch-finally and assert that the catch clause was entered. The result is an awkward and verbose test case (example below) that hides the simple test intent and the result expectations
A more elegant alternative is using the one-line dedicated Chai assertion: expect(method).to.throw (or in Jest: expect(method).toThrow()). It's absolutely mandatory to also ensure the exception contains a property that tells the error type, otherwise, given just a generic error, the application won't be able to do much more than show a disappointing message to the user
Otherwise: It will be challenging to infer from the test reports (e.g. CI reports) what went wrong
โ Code Examples
๐ Anti-pattern Example: A long test case that tries to assert the existence of error with try-catch
it("When no product name, it throws error 400", async () => {
let errorWeExpectFor = null;
try {
const result = await addNewProduct({});
} catch (error) {
expect(error.code).to.equal("InvalidInput");
errorWeExpectFor = error;
}
expect(errorWeExpectFor).not.to.be.null;
//if this assertion fails, the tests results/reports will only show
//that some value is null, there won't be a word about a missing Exception
});
๐ Doing It Right Example: A human-readable expectation that could be understood easily, maybe even by QA or technical PM
it("When no product name, it throws error 400", async () => {
await expect(addNewProduct({}))
.to.eventually.throw(AppError)
.with.property("code", "InvalidInput");
});
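For Jest users, a sketch of the equivalent expectation (assuming the same async addNewProduct and AppError as above) uses the rejects modifier:
// Jest flavor - no try/catch needed, the rejection is asserted directly
it("When no product name, it throws error 400", async () => {
  await expect(addNewProduct({})).rejects.toThrow(AppError);
  // optionally assert the error metadata so the app can react to the error type
  await expect(addNewProduct({})).rejects.toMatchObject({ code: "InvalidInput" });
});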
1.11 Tag your tests
Do: Different tests must run on different scenarios: quick smoke, IO-less tests should run when a developer saves or commits a file, full end-to-end tests usually run when a new pull request is submitted, etc. This can be achieved by tagging tests with keywords like #cold #api #sanity so you can grep with your testing harness and invoke the desired subset. For example, this is how you would invoke only the sanity test group with Mocha: mocha --grep 'sanity'
Otherwise: Running all the tests, including tests that perform dozens of DB queries, any time a developer makes a small change can be extremely slow and keeps developers away from running tests
โ Code Examples
Doing It Right Example: Tagging tests as '#cold-test' allows the test runner to execute only fast tests (Cold === quick tests that are doing no IO and can be executed frequently even as the developer is typing)
//this test is fast (no DB) and we're tagging it accordingly
//now the user/CI can run it frequently
describe("Order service", function() {
describe("Add new order #cold-test #sanity", function() {
test("Scenario - no currency was supplied. Expectation - Use the default currency #sanity", function() {
//code logic here
});
});
});
1.12 Categorize tests under at least 2 levels
Do: Apply some structure to your test suite so an occasional visitor could easily understand the requirements (tests are the best documentation) and the various scenarios that are being tested. A common method for this is placing at least 2 'describe' blocks above your tests: the 1st is for the name of the unit under test and the 2nd for an additional level of categorization like the scenario or custom categories (see code examples and the screenshot below). Doing so will also greatly improve the test reports: the reader will easily infer the test categories, delve into the desired section and correlate failing tests. In addition, it will get much easier for a developer to navigate through the code of a suite with many tests. There are multiple alternative structures for the test suite that you may consider like given-when-then and RITE
โ Code Examples
๐ Doing It Right Example: Structuring suite with the name of unit under test and scenarios will lead to the convenient report that is shown below
// Unit under test
describe("Transfer service", () => {
//Scenario
describe("When no credit", () => {
//Expectation
test("Then the response status should decline", () => {});
//Expectation
test("Then it should send email to admin", () => {});
});
});
๐ Anti-pattern Example: A flat list of tests will make it harder for the reader to identify the user stories and correlate failing tests
test("Then the response status should decline", () => {});
test("Then it should send email", () => {});
test("Then there should not be a new transfer record", () => {});
1.13 Other generic good testing hygiene
Do: Learn and practice TDD principles - they are extremely valuable for many, but don't get intimidated if they don't fit your style, you're not the only one. Consider writing the tests before the code in a red-green-refactor style, ensure each test checks exactly one thing, when you find a bug - before fixing it, write a test that will detect this bug in the future, let each test fail at least once before turning green, start a module by writing quick and simplistic code that satisfies the test - then refactor gradually and take it to a production grade level, and avoid any dependency on the environment (paths, OS, etc.)
Otherwise: You'll miss pearls of wisdom that were collected for decades
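A minimal sketch of the red-green-refactor flow mentioned above; the convert function is hypothetical and exists only to illustrate the order of work:
// Step 1 (red): write the test first and watch it fail - there is no implementation yet
test("When converting 0 units, then the result is 0", () => {
  expect(convert(0, "USD", "EUR")).toBe(0);
});

// Step 2 (green): write the simplest code that satisfies the test
function convert(amount, fromCurrency, toCurrency) {
  return 0; // deliberately naive - the next tests will force the real logic in
}

// Step 3 (refactor): improve the design gradually while keeping every test green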
Section 2: Backend Testing
2.1 Enrich your testing portfolio: Look beyond unit tests and the pyramid
Don't get me wrong, in 2019 the testing pyramid, TDD, and unit tests are still a powerful technique and are probably the best match for many applications. Only like any other model, despite its usefulness, it must be wrong sometimes. For example, consider an IoT application that ingests many events into a message-bus like Kafka/RabbitMQ, which then flow into some data-warehouse and are eventually queried by some analytics UI. Should we really spend 50% of our testing budget on writing unit tests for an application that is integration-centric and has almost no logic? As the diversity of application types increases (bots, crypto, Alexa-skills) greater are the chances to find scenarios where the testing pyramid is not the best match.
It's time to enrich your testing portfolio and become familiar with more testing types (the next bullets suggest a few ideas), mind models like the testing pyramid but also match testing types to real-world problems that you're facing ('Hey, our API is broken, let's write consumer-driven contract testing!'), diversify your tests like an investor that builds a portfolio based on risk analysis - assess where problems might arise and match some prevention measures to mitigate those potential risks
A word of caution: the TDD argument in the software world takes a typical false-dichotomy face, some preach to use it everywhere, others think it's the devil. Everyone who speaks in absolutes is wrong :]
โ Code Examples
Doing It Right Example: Cindy Sridharan suggests a rich testing portfolio in her amazing post 'Testing Microservices - the sane way'
2.2 Component testing might be your best affair
Component tests focus on the Microservice 'unit', they work against the API and don't mock anything which belongs to the Microservice itself (e.g. real DB, or at least the in-memory version of that DB) but stub anything that is external like calls to other Microservices. By doing so, we test what we deploy, approach the app from outward to inward and gain great confidence in a reasonable amount of time.
We have a full guide that is solely dedicated to writing component tests in the right way
โ Code Examples
Doing It Right Example: Supertest allows approaching Express API in-process (fast and covers many layers)
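The example above was originally shown as an image; a minimal in-process sketch with Supertest might look like the following (the exported Express app and the /order route are assumptions):
const request = require("supertest");
const app = require("../app"); // the Express app, exported without calling listen()

test("When adding a valid order, then get back a 200 response with an approved order", async () => {
  // Act - exercise the real API in-process, no network socket or deployed server needed
  const response = await request(app)
    .post("/order")
    .send({ userId: 1, productId: 2, mode: "approved" });

  // Assert - check the outcome exactly like an API consumer would
  expect(response.status).toBe(200);
  expect(response.body.mode).toBe("approved");
});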
2.3 Ensure new releases don't break the API using contract tests
2.4 Test your middlewares in isolation
Do: Many avoid middleware testing because it represents a small portion of the system and requires a live Express server. Both reasons are wrong - middlewares are small but affect all or most of the requests and can be tested easily as pure functions that get {req, res} JS objects. To test a middleware function one should just invoke it and spy (using Sinon for example) on the interaction with the {req, res} objects to ensure the function performed the right action. The library node-mocks-http takes it even further and factors the {req, res} objects along with spying on their behavior. For example, it can assert whether the http status that was set on the res object matches the expectation (see example below)
Otherwise: A bug in Express middleware === a bug in all or most requests
โ Code Examples
๐Doing It Right Example: Testing middleware in isolation without issuing network calls and waking-up the entire Express machine
//the middleware we want to test
const unitUnderTest = require("./middleware");
const httpMocks = require("node-mocks-http");
//Jest syntax, equivalent to describe() & it() in Mocha
test("A request without authentication header, should return http status 403", () => {
const request = httpMocks.createRequest({
method: "GET",
url: "/user/42",
headers: {
authentication: ""
}
});
const response = httpMocks.createResponse();
unitUnderTest(request, response);
expect(response.statusCode).toBe(403);
});
2.5 Measure and refactor using static analysis tools
Credit: Keith Holliday
โ Code Examples
๐ Doing It Right Example: CodeClimate, a commercial tool that can identify complex methods:
2.6 Check your readiness for Node-related chaos
Do: Weirdly, most software testing covers only logic and data, but some of the worst things that happen (and are really hard to mitigate) are infrastructural issues. For example, did you ever test what happens when your process memory is overloaded, or when the server/process dies, or does your monitoring system realize when the API becomes 50% slower? To test and mitigate these types of bad things, Chaos engineering was born at Netflix. It aims to provide awareness, frameworks and tools for testing our app resiliency for chaotic issues. For example, one of its famous tools, the chaos monkey, randomly kills servers to ensure that our service can still serve users without relying on a single server (there is also a Kubernetes version, kube-monkey, that kills pods). All these tools work on the hosting/platform level, but what if you wish to test and generate pure Node chaos, like checking how your Node process copes with uncaught errors, unhandled promise rejection, v8 memory overloaded with the max allowed of 1.7GB, or whether your UX remains satisfactory when the event loop gets blocked often? To address this I've written node-chaos (alpha) which provides all sorts of Node-related chaotic acts
โ Code Examples
Doing It Right Example: Node-chaos can generate all sorts of Node.js pranks so you can test how resilient your app is to chaos
2.7 Avoid global test fixtures and seeds, add data per-test
โ Code Examples
๐ Anti-Pattern Example: tests are not independent and rely on some global hook to feed global DB data
before(async () => {
//adding sites and admins data to our DB. Where is the data? outside. At some external json or migration framework
await DB.AddSeedDataFromJson('seed.json');
});
it("When updating site name, get successful confirmation", async () => {
//I know that site name "portal" exists - I saw it in the seed files
const siteToUpdate = await SiteService.getSiteByName("Portal");
const updateNameResult = await SiteService.changeName(siteToUpdate, "newName");
expect(updateNameResult).to.be(true);
});
it("When querying by site name, get the right site", async () => {
//I know that site name "portal" exists - I saw it in the seed files
const siteToCheck = await SiteService.getSiteByName("Portal");
expect(siteToCheck.name).to.be.equal("Portal"); //Failure! The previous test changed the name :[
});
๐ Doing It Right Example: We can stay within the test, each test acts on its own set of data
it("When updating site name, get successful confirmation", async () => {
//test is adding a fresh new records and acting on the records only
const siteUnderTest = await SiteService.addSite({
name: "siteForUpdateTest"
});
const updateNameResult = await SiteService.changeName(siteUnderTest, "newName");
expect(updateNameResult).to.be(true);
});
2.8 Choose a clear data clean-up strategy: After-all (recommended) or after-each
Do: The timing of when the tests clean the database determines the way the tests are written. The two most viable options are cleaning after all the tests vs cleaning after every single test. Choosing the latter option, cleaning after every single test, guarantees clean tables and builds convenient testing perks for the developer. No other records exist when the test starts, one can have certainty which data is being queried and even might be tempted to count rows during assertions. This comes with severe downsides: when running in a multi-process mode, tests are likely to interfere with each other. While process-1 purges tables, at the very moment process-2 queries for data and fails (because the DB was suddenly deleted by process-1). On top of this, it's harder to troubleshoot failing tests - visiting the DB will show no records.
The second option is to clean up after all the test files have finished (or even daily!). This approach means that the same DB with existing records serves all the tests and processes. To avoid stepping on each other's toes, the tests must add and act only on the specific records that they have added. Need to check that some record was added? Assume that there are other thousands of records and query for records that were added explicitly. Need to check that a record was deleted? Can't assume an empty table, check that this specific record is not there. This technique brings a few powerful gains: it works natively in multi-process mode, and when a developer wishes to understand what happened - the data is there and not deleted. It also increases the chance of finding bugs because the DB is full of records and not artificially empty. See the full comparison table here.
โ Code Examples
Cleaning after ALL the tests, not necessarily after every run. The more data we have while the tests are running - the more it resembles the real production environment
// After-all clean up (recommended)
// global-teardown.js
module.exports = async () => {
// ...
if (Math.ceil(Math.random() * 10) === 10) {
await new OrderRepository().cleanup();
}
};
2.9 Isolate the component from the world using HTTP interceptor
Do: Isolate the component under test by intercepting any outgoing HTTP request and providing the desired response so the collaborator HTTP API won't get hit. Nock is a great tool for this mission as it provides a convenient syntax for defining external services behavior. Isolation is a must to prevent noise and slow performance, but mostly to simulate various scenarios and responses - a good flight simulator is not about painting a clear blue sky but rather about bringing safe storms and chaos. This is reinforced in a Microservice architecture where the focus should always be on a single component without involving the rest of the world. Though it's possible to simulate external service behavior using test doubles (mocking), it's preferable not to touch the deployed code and act on the network level in order to keep the tests pure black-box. The downside of isolation is not detecting when the collaborator component changes and not realizing misunderstandings between the two services - make sure to compensate for this using a few contract or E2E tests
โ Code Examples
Preventing network calls to external components allows simulating scenarios and minimizing the noise
// Intercept requests for 3rd party APIs and return a predefined response
beforeEach(() => {
nock('http://localhost/user/').get(`/1`).reply(200, {
id: 1,
name: 'John',
});
});
2.10 Test the response schema, mostly when there are auto-generated fields
โ Code Examples
๐ Asserting that fields with dynamic value exist and have the right type
test('When adding a new valid order, Then should get back approval with 200 response', async () => {
// ...
//Assert
expect(receivedAPIResponse).toMatchObject({
status: 200,
data: {
id: expect.any(Number), // Any number satisfies this test
mode: 'approved',
},
});
});
2.11 Check integration corner cases and chaos
โ Code Examples
๐ Ensuring that on network failures, the circuit breaker can save the day
test('When users service replies with 503 once and retry mechanism is applied, then an order is added successfully', async () => {
//Arrange
nock.removeInterceptor(userServiceNock.interceptors[0])
nock('http://localhost/user/')
.get('/1')
.reply(503, undefined, { 'Retry-After': 100 });
nock('http://localhost/user/')
.get('/1')
.reply(200);
const orderToAdd = {
userId: 1,
productId: 2,
mode: 'approved',
};
//Act
const response = await axiosAPIClient.post('/order', orderToAdd);
//Assert
expect(response.status).toBe(200);
});
2.12 Test the five potential outcomes
• Response - The test invokes an action (e.g., via API) and gets a response. It's now concerned with checking the response data correctness, schema, and HTTP status
• A new state - After invoking an action, some publicly accessible data is probably modified
• External calls - After invoking an action, the app might call an external component via HTTP or any other transport. For example, a call to send SMS, email or charge a credit card
• Message queues - The outcome of a flow might be a message in a queue
• Observability - Some things must be monitored, like errors or remarkable business events. When a transaction fails, we expect not only the right response but also correct error handling and proper logging/metrics. This information goes directly to a very important user - the ops user (i.e., production SRE/admin)
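A sketch that asserts more than one of the outcomes above within a single flow - the routes, the email-service interceptor and the axiosAPIClient helper are assumptions, not the guide's code:
test("When a valid order is added, then a 200 response is returned and a confirmation email is requested", async () => {
  // Arrange - intercept the outgoing call to the (assumed) external email service
  const emailCallScope = nock("http://localhost/email-service").post("/send").reply(202);
  const orderToAdd = { userId: 1, productId: 2, mode: "approved" };

  // Act
  const response = await axiosAPIClient.post("/order", orderToAdd);

  // Assert - outcome 1: the response itself
  expect(response.status).toBe(200);
  // Assert - outcome 2: the new state is publicly observable
  const { data: savedOrder } = await axiosAPIClient.get(`/order/${response.data.id}`);
  expect(savedOrder).toMatchObject(orderToAdd);
  // Assert - outcome 3: the external call was fired
  expect(emailCallScope.isDone()).toBe(true);
});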
Section 3: Frontend Testing
3.1 Separate UI from functionality
โ Code Examples
๐ Doing It Right Example: Separating out the UI details
test("When users-list is flagged to show only VIP, should display only VIP members", () => {
// Arrange
const allUsers = [{ id: 1, name: "Yoni Goldberg", vip: false }, { id: 2, name: "John Doe", vip: true }];
// Act
const { getAllByTestId } = render(<UsersList users={allUsers} showOnlyVIP={true} />);
// Assert - Extract the data from the UI first
const allRenderedUsers = getAllByTestId("user").map(uiElement => uiElement.textContent);
const allRealVIPUsers = allUsers.filter(user => user.vip).map(user => user.name);
expect(allRenderedUsers).toEqual(allRealVIPUsers); //compare data with data, no UI here
});
Anti-Pattern Example: Assertions mix UI details and data
test("When flagging to show only VIP, should display only VIP members", () => {
// Arrange
const allUsers = [{ id: 1, name: "Yoni Goldberg", vip: false }, { id: 2, name: "John Doe", vip: true }];
// Act
const { getAllByTestId } = render(<UsersList users={allUsers} showOnlyVIP={true} />);
// Assert - Mix UI & data in assertion
expect(getAllByTestId("user")).toEqual('[<li data-test-id="user">John Doe</li>]');
});
3.2 Query HTML elements based on attributes that are unlikely to change
Do: Query HTML elements based on attributes that are likely to survive graphic changes (e.g. form labels), unlike CSS selectors. If the designated element doesn't have such attributes, create a dedicated test attribute like 'test-id-submit-button'. Going this route not only ensures that your functional/logic tests never break because of look & feel changes, but it also becomes clear to the entire team that this element and attribute are utilized by tests and shouldn't get removed
Otherwise: You want to test the login functionality that spans many components, logic and services, everything is set up perfectly - stubs, spies, Ajax calls are isolated. All seems perfect. Then the test fails because the designer changed the div CSS class from 'thick-border' to 'thin-border'
โ Code Examples
๐ Doing It Right Example: Querying an element using a dedicated attribute for testing
// the markup code (part of React component)
<h3>
<Badge pill className="fixed_badge" variant="dark">
<span data-test-id="errorsLabel">{value}</span>
<!-- note the attribute data-test-id -->
</Badge>
</h3>
// this example is using react-testing-library
test("Whenever no data is passed to metric, show 0 as default", () => {
// Arrange
const metricValue = undefined;
// Act
const { getByTestId } = render(<DashboardMetric value={metricValue} />);
expect(getByTestId("errorsLabel").textContent).toBe("0");
});
๐ Anti-Pattern Example: Relying on CSS attributes
<!-- the markup code (part of React component) -->
<span id="metric" className="d-flex-column">{value}</span>
<!-- what if the designer changes the class? -->
// this example is using enzyme
test("Whenever no data is passed, error metric shows zero", () => {
// ...
expect(wrapper.find("[className='d-flex-column']").text()).toBe("0");
});
3.3 Whenever possible, test with a realistic and fully rendered component
Do: Whenever reasonably sized, test your component from outside like your users do: fully render the UI, act on it and assert that the rendered UI behaves as expected. Avoid all sorts of mocking, partial and shallow rendering - this approach might result in missed bugs due to lack of detail and makes maintenance harder as the tests mess with the internals (see bullet 'Favor blackbox testing'). If one of the child components significantly slows things down (e.g. animation) or complicates the setup - consider explicitly replacing it with a fake
With all that said, a word of caution is in order: this technique works for small/medium components that pack a reasonable number of child components. Fully rendering a component with too many children will make it hard to reason about test failures (root cause analysis) and might get too slow. In such cases, write only a few tests against that fat parent component and more tests against its children
โ Code Examples
๐ Doing It Right Example: Working realistically with a fully rendered component
class Calendar extends React.Component {
static defaultProps = { showFilters: false };
render() {
return (
<div>
A filters panel with a button to hide/show filters
<FiltersPanel showFilter={this.props.showFilters} title="Choose Filters" />
</div>
);
}
}
//Examples use React & Enzyme
test("Realistic approach: When clicked to show filters, filters are displayed", () => {
// Arrange
const wrapper = mount(<Calendar showFilters={false} />);
// Act
wrapper.find("button").simulate("click");
// Assert
expect(wrapper.text().includes("Choose Filter")).toBe(true);
// This is how the user will approach this element: by text
});
๐ Anti-Pattern Example: Mocking the reality with shallow rendering
test("Shallow/mocked approach: When clicked to show filters, filters are displayed", () => {
// Arrange
const wrapper = shallow(<Calendar showFilters={false} title="Choose Filter" />);
// Act
wrapper
.find("filtersPanel")
.instance()
.showFilters();
// Tap into the internals, bypass the UI and invoke a method. White-box approach
// Assert
expect(wrapper.find("Filter").props()).toEqual({ title: "Choose Filter" });
// what if we change the prop name or don't pass anything relevant?
});
3.4 Don't sleep, use frameworks' built-in support for async events. Also try to speed things up
Do: In many cases, the unit under test completion time is just unknown (e.g. animation suspends element appearance) - in that case, avoid sleeping (e.g. setTimeout) and prefer more deterministic methods that most platforms provide. Some libraries allow awaiting on operations (e.g. Cypress cy.request('url')), others provide an API for waiting like the @testing-library/dom method wait(expect(element)). Sometimes a more elegant way is to stub the slow resource, like an API for example, and then once the response moment becomes deterministic the component can be explicitly re-rendered. When depending upon some external component that sleeps, it might prove useful to hurry up the clock (see the fake-timers sketch after the examples below). Sleeping is a pattern to avoid because it forces your test to be slow or risky (when waiting for a too short period). Whenever sleeping and polling is inevitable and there's no support from the testing framework, some npm libraries like wait-for-expect can help with a semi-deterministic solution
Otherwise: When sleeping for a long time, tests will be an order of magnitude slower. When trying to sleep for small numbers, the test will fail when the unit under test didn't respond in a timely fashion. So it boils down to a trade-off between flakiness and bad performance
โ Code Examples
Doing It Right Example: E2E API that resolves only when the async operation is done (Cypress)
// using Cypress
cy.get("#show-products").click(); // navigate
cy.wait("@products"); // wait for route to appear
// this line will get executed only when the route is ready
๐ Doing It Right Example: Testing library that waits for DOM elements
// @testing-library/dom
test("movie title appears", async () => {
// element is initially not present...
// wait for appearance
await wait(() => {
expect(getByText("the lion king")).toBeInTheDocument();
});
// wait for appearance and return the element
const movie = await waitForElement(() => getByText("the lion king"));
});
๐ Anti-Pattern Example: custom sleep code
test("movie title appears", async () => {
// element is initially not present...
// custom wait logic (caution: simplistic, no timeout)
const interval = setInterval(() => {
const found = getByText("the lion king");
if (found) {
clearInterval(interval);
expect(getByText("the lion king")).toBeInTheDocument();
}
}, 100);
// wait for appearance and return the element
const movie = await waitForElement(() => getByText("the lion king"));
});
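When the waiting comes from a component that deliberately delays (e.g. a debounce), hurrying the clock is another deterministic option. A sketch with Jest fake timers - the debounce helper and the callback are hypothetical:
// "Hurry up the clock" instead of sleeping in the test
test("When the user stops typing, the search fires after the debounce delay", () => {
  jest.useFakeTimers();
  const onSearch = jest.fn();
  const debouncedSearch = debounce(onSearch, 300); // hypothetical debounced helper under test

  debouncedSearch("the lion king");
  expect(onSearch).not.toHaveBeenCalled(); // still within the debounce window

  jest.advanceTimersByTime(300); // fast-forward time deterministically, no real waiting
  expect(onSearch).toHaveBeenCalledWith("the lion king");

  jest.useRealTimers();
});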
3.5 Watch how the content is served over the network
3.6 Stub flaky and slow resources like backend APIs
Do: When coding your mainstream tests (not E2E tests), avoid involving any resource that is beyond your responsibility and control, like backend APIs, and use stubs instead (i.e. test doubles). Practically, instead of real network calls to APIs, use some test double library (like Sinon, Test doubles, etc.) for stubbing the API response. The main benefit is preventing flakiness - testing or staging APIs by definition are not highly stable and from time to time will fail your tests although YOUR component behaves just fine (the production env was not meant for testing and it usually throttles requests). Doing this will allow simulating various API behaviors that should drive your component behavior, such as when no data was found or the case when the API throws an error. Last but not least, network calls will greatly slow down the tests
โ Code Examples
๐ Doing It Right Example: Stubbing or intercepting API calls
// unit under test
export default function ProductsList() {
const [products, setProducts] = useState(false);
const fetchProducts = async () => {
const products = await axios.get("api/products");
setProducts(products);
};
useEffect(() => {
fetchProducts();
}, []);
return products ? <div>{products}</div> : <div data-test-id="no-products-message">No products</div>;
}
// test
test("When no products exist, show the appropriate message", () => {
// Arrange
nock("api")
.get(`/products`)
.reply(404);
// Act
const { getByTestId } = render(<ProductsList />);
// Assert
expect(getByTestId("no-products-message")).toBeTruthy();
});
3.7 Have very few end-to-end tests that span the whole system
Do: Although E2E (end-to-end) usually means UI-only testing with a real browser (see bullet 3.6), for others it means tests that stretch the entire system including the real backend. The latter type of tests is highly valuable as they cover integration bugs between frontend and backend that might happen due to a wrong understanding of the exchange schema. They are also an efficient method to discover backend-to-backend integration issues (e.g. Microservice A sends the wrong message to Microservice B) and even to detect deployment failures - there are no backend frameworks for E2E testing that are as friendly and mature as UI frameworks like Cypress and Puppeteer. The downside of such tests is the high cost of configuring an environment with so many components, and mostly their brittleness - given 50 microservices, even if one fails then the entire E2E just failed. For that reason, we should use this technique sparingly and probably have 1-10 of those and no more. That said, even a small number of E2E tests are likely to catch the type of issues they are targeted for - deployment & integration faults. It's advisable to run those over a production-like staging environment
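A minimal sketch of such a system-spanning test using Cypress against a staging environment - the URLs, selectors and API route are assumptions:
// E2E over a production-like staging environment - UI, API and DB are all real
it("When a new order is submitted from the UI, then it is persisted by the backend", () => {
  cy.visit("https://staging.mysite.com/orders/new");
  cy.get("[data-test-id=product-name]").type("Sleek Cotton Computer");
  cy.get("[data-test-id=submit-order]").click();

  // Assert against the real API - this catches frontend/backend schema mismatches
  cy.request("https://staging.mysite.com/api/orders?latest=true").then((response) => {
    expect(response.status).to.equal(200);
    expect(response.body[0].productName).to.equal("Sleek Cotton Computer");
  });
});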
3.8 Speed-up E2E tests by reusing login credentials
Do: In E2E tests that involve a real backend and rely on a valid user token for API calls, it doesn't pay off to isolate the test to a level where a user is created and logged-in in every request. Instead, login only once before the tests execution starts (i.e. before-all hook), save the token in some local storage and reuse it across requests. This seems to violate one of the core testing principles - keep the test autonomous without resource coupling. While this is a valid worry, in E2E tests performance is a key concern and creating 1-3 API requests before starting each individual test might lead to horrible execution time. Reusing credentials doesn't mean the tests have to act on the same user records - if relying on user records (e.g. test user payments history) then make sure to generate those records as part of the test and avoid sharing their existence with other tests. Also remember that the backend can be faked - if your tests are focused on the frontend it might be better to isolate it and stub the backend API (see bullet 3.6).
Otherwise: Given 200 test cases and assuming login=100ms, that is 20 seconds only for logging in again and again
โ Code Examples
๐ Doing It Right Example: Logging-in before-all and not before-each
let authenticationToken;
// happens before ALL tests run
before(() => {
cy.request('POST', 'http://localhost:3000/login', {
username: Cypress.env('username'),
password: Cypress.env('password'),
})
.its('body')
.then((responseFromLogin) => {
authenticationToken = responseFromLogin.token;
})
})
// happens before EACH test
beforeEach(() => {
cy.visit('/home', {
onBeforeLoad(win) {
win.localStorage.setItem('token', JSON.stringify(authenticationToken))
},
})
})
3.9 Have one E2E smoke test that just travels across the site map
โ Otherwise: Everything might seem perfect, all tests pass, production health-check is also positive but the Payment component had some packaging issue and only the /Payment route is not rendering
โ Code Examples
๐ Doing It Right Example: Smoke travelling across all pages
it("When doing smoke testing over all page, should load them all successfully", () => {
// exemplified using Cypress but can be implemented easily
// using any E2E suite
cy.visit("https://mysite.com/home");
cy.contains("Home");
cy.visit("https://mysite.com/Login");
cy.contains("Login");
cy.visit("https://mysite.com/About");
cy.contains("About");
});
3.10 Expose the tests as a live collaborative document
Do: Besides increasing app reliability, tests bring another attractive opportunity to the table - they serve as live app documentation. Since tests inherently speak at a less-technical, product/UX level of language, using the right tools they can serve as a communication artifact that greatly aligns all the peers - developers and their customers. For example, some frameworks allow expressing the flow and expectations (i.e. the test plan) using a human-readable language so any stakeholder, including product managers, can read, approve and collaborate on the tests which just became the live requirements document. This technique is also referred to as 'acceptance tests' as it allows the customer to define his acceptance criteria in plain language. This is BDD (behavior-driven testing) at its purest form. One of the popular frameworks that enable this is Cucumber which has a JavaScript flavor, see example below. Another similar yet different opportunity, StoryBook, allows exposing UI components as a graphic catalog where one can walk through the various states of each component (e.g. render a grid w/o filters, render that grid with multiple rows or with none, etc.), see what it looks like, and how to trigger that state - this can appeal also to product folks but mostly serves as live doc for developers who consume those components.
โ Code Examples
๐ Doing It Right Example: Describing tests in human-language using cucumber-js
This is how one can describe tests using cucumber: plain language that allows anyone to understand and collaborate
Feature: Twitter new tweet
I want to tweet something in Twitter
@focus
Scenario: Tweeting from the home page
Given I open Twitter home
Given I click on "New tweet" button
Given I type "Hello followers!" in the textbox
Given I click on "Submit" button
Then I see message "Tweet saved"
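The JavaScript glue behind such a feature file is a set of step definitions. A sketch with cucumber-js - the browser helper on the world object is hypothetical:
// steps/tweet.steps.js - maps the plain-language steps to automation code
const { Given, Then } = require("@cucumber/cucumber");
const assert = require("assert");

Given("I open Twitter home", async function () {
  await this.browser.goTo("https://twitter.com/home"); // this.browser is a hypothetical helper on the cucumber world
});

Given("I click on {string} button", async function (buttonName) {
  await this.browser.clickButton(buttonName);
});

Given("I type {string} in the textbox", async function (text) {
  await this.browser.typeInTextbox(text);
});

Then("I see message {string}", async function (expectedMessage) {
  assert.ok(await this.browser.hasText(expectedMessage));
});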
๐ Doing It Right Example: Visualizing our components, their various states and inputs using Storybook
3.11 Detect visual issues with automated tools
Do: Set up automated tools to capture UI screenshots when changes are presented and detect visual issues like content overlapping or breaking. This ensures that not only is the right data prepared but also that the user can conveniently see it. This technique is not widely adopted; our testing mindset leans toward functional tests, but it's the visuals that the user experiences, and with so many device types it's very easy to overlook some nasty UI bug. Some free tools can provide the basics - generate and save screenshots for the inspection of human eyes. While this approach might be sufficient for small apps, it's flawed as any other manual testing that demands human labor anytime something changes. On the other hand, it's quite challenging to detect UI issues automatically due to the lack of a clear definition - this is where the field of 'Visual Regression' chimes in and solves this puzzle by comparing old UI with the latest changes and detecting differences. Some OSS/free tools can provide some of this functionality (e.g. wraith, PhantomCSS) but might require significant setup time. The commercial line of tools (e.g. Applitools, Percy.io) takes it a step further by smoothing the installation and packing advanced features like management UI, alerting, smart capturing by eliminating 'visual noise' (e.g. ads, animations) and even root cause analysis of the DOM/CSS changes that led to the issue
โ Code Examples
๐ Anti-Pattern Example: A typical visual regression - right content that is served badly
๐ Doing It Right Example: Configuring wraith to capture and compare UI snapshots
# Add as many domains as necessary. Key will act as a label
domains:
  english: "http://www.mysite.com"
# Type screen widths below, here are a couple of examples
screen_widths:
  - 600
  - 768
  - 1024
  - 1280
# Type page URL paths below, here are a couple of examples
paths:
  about:
    path: /about
    selector: '.about'
  subscribe:
    selector: '.subscribe'
    path: /subscribe
๐ Doing It Right Example: Using Applitools to get snapshot comparison and other advanced features
import * as todoPage from "../page-objects/todo-page";
describe("visual validation", () => {
before(() => todoPage.navigate());
beforeEach(() => cy.eyesOpen({ appName: "TAU TodoMVC" }));
afterEach(() => cy.eyesClose());
it("should look good", () => {
cy.eyesCheckWindow("empty todo list");
todoPage.addTodo("Clean room");
todoPage.addTodo("Learn javascript");
cy.eyesCheckWindow("two todos");
todoPage.toggleTodo(0);
cy.eyesCheckWindow("mark as completed");
});
});
Section 4: Measuring Test Effectiveness
4.1 Get enough coverage for being confident, ~80% seems to be the lucky number
โ Do: The purpose of testing is to get enough confidence for moving fast, obviously the more code is tested the more confident the team can be. Coverage is a measure of how many code lines (and branches, statements, etc) are being reached by the tests. So how much is enough? 10โ30% is obviously too low to get any sense about the build correctness, on the other side 100% is very expensive and might shift your focus from the critical paths to the exotic corners of the code. The long answer is that it depends on many factors like the type of applicationโโโif youโre building the next generation of Airbus A380 than 100% is a must, for a cartoon pictures website 50% might be too much. Although most of the testing enthusiasts claim that the right coverage threshold is contextual, most of them also mention the number 80% as a thumb of a rule (Fowler: โin the upper 80s or 90sโ) that presumably should satisfy most of the applications.
Implementation tips: You may want to configure your continuous integration (CI) to have a coverage threshold (Jest link) and stop a build that doesnโt stand to this standard (itโs also possible to configure threshold per component, see code example below). On top of this, consider detecting build coverage decrease (when a newly committed code has less coverage)โโโthis will push developers raising or at least preserving the amount of tested code. All that said, coverage is only one measure, a quantitative based one, that is not enough to tell the robustness of your testing. And it can also be fooled as illustrated in the next bullets
โ Otherwise: Confidence and numbers go hand in hand; without really knowing that you tested most of the system, there will also be some fear, and fear will slow you down
โ Code Examples
๐ Example: A typical coverage report
๐ Doing It Right Example: Setting up coverage per component (using Jest)
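For illustration, a minimal sketch of such a per-component threshold in jest.config.js - the paths and numbers below are placeholders, not a recommendation:

// jest.config.js - a hedged sketch; adjust the paths and numbers to your own project
module.exports = {
  collectCoverage: true,
  coverageThreshold: {
    // global threshold applied to the whole codebase
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
    // stricter threshold for a specific, critical component (hypothetical path)
    "./src/components/payments/": {
      branches: 100,
      functions: 100,
    },
  },
};

With this in place, `jest --coverage` (or the CI step that runs it) exits with a failure when any threshold is not met, which is what stops the build.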
โช ๏ธ 4.2 Inspect coverage reports to detect untested areas and other oddities
โ Do: Some issues sneak just under the radar and are really hard to find using traditional tools. These are not really bugs but more of a surprising application behavior that might have a severe impact. For example, often some code areas are never or rarely invoked - you thought that the 'PricingCalculator' class always sets the product price, but it turns out it is actually never invoked although we have 10,000 products in the DB and many sales... Code coverage reports help you realize whether the application behaves the way you believe it does. Other than that, they can also highlight which types of code are not tested - being informed that 80% of the code is tested doesn't tell whether the critical parts are covered. Generating reports is easy - just run your app in production or during testing with coverage tracking and then see colorful reports that highlight how frequently each code area is invoked. If you take the time to glance at this data - you might find some gotchas
โ Otherwise: If you donโt know which parts of your code are left un-tested, you donโt know where the issues might come from
โ Code Examples
๐ Anti-Pattern Example: Whatโs wrong with this coverage report?
Based on a real-world scenario where we tracked our application usage in QA and found interesting login patterns (Hint: the number of login failures is disproportionate; something is clearly wrong. In the end it turned out that some frontend bug kept hitting the backend login API)
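For reference, a minimal sketch of generating such a coverage report - assuming Jest, or Mocha with nyc (Istanbul); adapt the commands to your own test runner:

# With Jest - collect coverage while running the tests and write an HTML report to ./coverage
npx jest --coverage

# With Mocha - use nyc (Istanbul) to instrument the code and produce the same kind of report
npx nyc --reporter=html --reporter=text mocha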
โช ๏ธ 4.3 Measure logical coverage using mutation testing
โ Do: Mutation-based testing is here to help by measuring the amount of code that was actually TESTED, not just VISITED. Stryker is a JavaScript library for mutation testing and the implementation is really neat:
(1) it intentionally changes the code and 'plants bugs'. For example, the code newOrder.price===0 becomes newOrder.price!=0. These 'bugs' are called mutations
(2) it runs the tests; if all succeed then we have a problem - the tests didn't serve their purpose of discovering bugs, so the mutations are said to have survived. If the tests failed, then great, the mutations were killed.
Knowing that all or most of the mutations were killed gives much higher confidence than traditional coverage, and the setup time is similar
โ Otherwise: You'll be fooled to believe that 85% coverage means your tests will detect bugs in 85% of your code
โ Code Examples
๐ Anti-Pattern Example: 100% coverage, 0% testing
function addNewOrder(newOrder) {
  logger.log(`Adding new order ${newOrder}`);
  DB.save(newOrder);
  Mailer.sendMail(newOrder.assignee, `A new order was placed ${newOrder}`);
  return { approved: true };
}

it("Test addNewOrder, don't use such test names", () => {
  addNewOrder({ assignee: "[email protected]", price: 120 });
}); //Triggers 100% code coverage, but it doesn't check anything
๐ Doing It Right Example: Stryker, a mutation testing tool, reports and counts the amount of code that is not really tested (survived mutations)
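As a rough, hedged sketch (the glob paths and test runner below are assumptions - check the Stryker docs for your setup), a minimal Stryker configuration could look like this:

// stryker.conf.js - an illustrative sketch, not an official recommendation
module.exports = {
  packageManager: "npm",
  testRunner: "jest",          // assumes Jest; Stryker also supports Mocha, Karma and others
  mutate: ["src/**/*.js"],     // the production code Stryker will mutate ("plant bugs" into)
  reporters: ["html", "clear-text", "progress"],
  coverageAnalysis: "perTest", // run only the tests that cover each mutant to keep runs fast
};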
โช ๏ธ4.4 Preventing test code issues with Test linters
โ Otherwise: Seeing 90% code coverage and 100% green tests will make your face wear a big smile only until you realize that many tests arenโt asserting for anything and many test suites were just skipped. Hopefully, you didnโt deploy anything based on this false observation
โ Code Examples
๐ Anti-Pattern Example: A test case full of errors, luckily all are caught by Linters
describe("Too short description", () => {
const userToken = userService.getDefaultToken() // *error:no-setup-in-describe, use hooks (sparingly) instead
it("Some description", () => {});//* error: valid-test-description. Must include the word "Should" + at least 5 words
});
it.skip("Test name", () => {// *error:no-skipped-tests, error:error:no-global-tests. Put tests only under describe or suite
expect("somevalue"); // error:no-assert
});
it("Test name", () => {// *error:no-identical-title. Assign unique titles to tests
});
Section 5๏ธโฃ: CI and Other Quality Measures
โช ๏ธ 5.1 Enrich your linters and abort builds that have linting issues
โ Do: Linters are a free lunch; with a 5 min setup you get for free an auto-pilot guarding your code and catching significant issues as you type. Gone are the days when linting was about cosmetics (no semi-colons!). Nowadays, linters can catch severe issues like errors that are not thrown correctly and losing information. On top of your basic set of rules (like ESLint standard or Airbnb style), consider including some specialized linters: eslint-plugin-chai-expect can discover tests without assertions, eslint-plugin-promise can discover promises with no resolve (your code will never continue), eslint-plugin-security can discover eager regex expressions that might be used for DoS attacks, and eslint-plugin-you-dont-need-lodash-underscore can alert when the code uses utility library methods that are part of the V8 core methods, like Lodash._map(โฆ)
โ Code Examples
๐ Anti-Pattern Example: The wrong Error object is thrown mistakenly, no stack-trace will appear for this error. Luckily, ESLint catches the next production bug
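To illustrate the kind of bug meant here, a small hedged sketch - ESLint's built-in no-throw-literal rule flags the first function; the function names and messages are made up:

// Anti-pattern: throwing a plain string - no stack-trace, no error type, harder to handle upstream
function updateProduct(productToUpdate) {
  if (!productToUpdate) {
    throw "No product was provided"; // flagged by ESLint's no-throw-literal rule
  }
  // ...
}

// Better: throw a real Error object so callers get a stack-trace and can inspect error properties
function updateProductSafely(productToUpdate) {
  if (!productToUpdate) {
    throw new Error("No product was provided");
  }
  // ...
}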
โช ๏ธ 5.2 Shorten the feedback loop with local developer-CI
โ Do: Using a CI with shiny quality inspections like testing, linting, vulnerabilities check, etc.? Help developers run this pipeline also locally to solicit instant feedback and shorten the feedback loop. Why? An efficient testing process constitutes many iterative loops: (1) try-outs -> (2) feedback -> (3) refactor. The faster the feedback, the more improvement iterations a developer can perform per module and perfect the results. On the flip side, when the feedback is late to come, fewer improvement iterations can be packed into a single day, the team might have already moved forward to another topic/task/module and might not be up for refining that module.
Practically, some CI vendors (example: CircleCI local CLI) allow running the pipeline locally. Some commercial tools like wallaby provide highly valuable testing insights as the developer prototypes (no affiliation). Alternatively, you may just add an npm script to package.json that runs all the quality commands (e.g. test, lint, vulnerabilities) - use tools like concurrently for parallelization and a non-zero exit code if one of the tools failed. Now the developer should just invoke one command - e.g. 'npm run quality' - to get instant feedback. Consider also aborting a commit if the quality check failed, using a githook (husky can help)
โ Code Examples
๐ Doing It Right Example: npm scripts that perform code quality inspection, all are run in parallel on demand or when a developer is trying to push new code
{
  "scripts": {
    "inspect:sanity-testing": "mocha **/**--test.js --grep \"sanity\"",
    "inspect:lint": "eslint .",
    "inspect:vulnerabilities": "npm audit",
    "inspect:license": "license-checker --failOn GPLv2",
    "inspect:complexity": "plato .",
    "inspect:all": "concurrently -c \"bgBlue.bold,bgMagenta.bold,yellow\" \"npm:inspect:sanity-testing\" \"npm:inspect:lint\" \"npm:inspect:vulnerabilities\" \"npm:inspect:license\""
  },
  "husky": {
    "hooks": {
      "pre-commit": "npm run inspect:all",
      "pre-push": "npm run inspect:all"
    }
  }
}
โช ๏ธ5.3 Perform e2e testing over a true production-mirror
The huge Kubernetes ecosystem has yet to formalize a standard, convenient tool for local and CI-mirroring, though many new tools are launched frequently. One approach is running a 'minimized-Kubernetes' using tools like Minikube and MicroK8s which resemble the real thing but come with less overhead. Another approach is testing over a remote 'real-Kubernetes'; some CI providers (e.g. Codefresh) have native integration with Kubernetes environments and make it easy to run the CI pipeline over the real thing, while others allow custom scripting against a remote Kubernetes.
โ Otherwise: Using different technologies for production and testing demands maintaining two deployment models and keeps the developers and the ops team separated
โ Code Examples
(Credit: Dynamic-environments Kubernetes)
๐ Example: a CI pipeline that generates a Kubernetes cluster on the fly
deploy:
  stage: deploy
  image: registry.gitlab.com/gitlab-examples/kubernetes-deploy
  script:
    - ./configureCluster.sh $KUBE_CA_PEM_FILE $KUBE_URL $KUBE_TOKEN
    - kubectl create ns $NAMESPACE
    - kubectl create secret -n $NAMESPACE docker-registry gitlab-registry --docker-server="$CI_REGISTRY" --docker-username="$CI_REGISTRY_USER" --docker-password="$CI_REGISTRY_PASSWORD" --docker-email="$GITLAB_USER_EMAIL"
    - mkdir .generated
    - echo "$CI_BUILD_REF_NAME-$CI_BUILD_REF"
    - sed -e "s/TAG/$CI_BUILD_REF_NAME-$CI_BUILD_REF/g" templates/deals.yaml | tee ".generated/deals.yaml"
    - kubectl apply --namespace $NAMESPACE -f .generated/deals.yaml
    - kubectl apply --namespace $NAMESPACE -f templates/my-sock-shop.yaml
  environment:
    name: test-for-ci
โช ๏ธ5.4 Parallelize test execution
โ Code Examples
๐ Doing It Right Example: Mocha parallel & Jest easily outrun the traditional Mocha thanks to testing parallelization (Credit: JavaScript Test-Runners Benchmark)
โช ๏ธ5.5 Stay away from legal issues using license and plagiarism check
โ Do: Licensing and plagiarism issues are probably not your main concern right now, but why not tick this box as well in 10 minutes? A bunch of npm packages like license-checker and plagiarism-check (commercial with a free plan) can be easily baked into your CI pipeline and inspect for sorrows like dependencies with restrictive licenses or code that was copy-pasted from Stack Overflow and apparently violates some copyrights
โ Otherwise: Unintentionally, developers might use packages with inappropriate licenses or copy-paste commercial code and run into legal issues
โ Code Examples
๐ Doing It Right Example:
# install license-checker in your CI environment or also locally
npm install -g license-checker
# ask it to scan all licenses and fail with an exit code other than 0 if it finds an unauthorized license. The CI system should catch this failure and stop the build
license-checker --summary --failOn BSD
โช ๏ธ5.6 Constantly inspect for vulnerable dependencies
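As a quick, hedged illustration of this bullet, a vulnerability check can be wired into CI with a single command - npm audit is built into npm, and its --audit-level flag fails the build only on findings above a chosen severity (the severity below is just an example):

# Run in CI: exits with a non-zero code (failing the build) when vulnerabilities
# at or above the chosen severity are found in the dependency tree
npm audit --audit-level=moderate

# Commercial/OSS alternatives such as snyk offer a similar CLI, e.g.:
# npx snyk test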
โช ๏ธ5.7 Automate dependency updates
โ Do: npm's and Yarn's introduction of lock files (package-lock.json, yarn.lock) introduced a serious challenge (the road to hell is paved with good intentions) - by default, packages no longer get updates. Even a team running many fresh deployments with 'npm install' & 'npm update' won't get any new updates. This leads to subpar dependency versions at best, or to vulnerable code at worst. Teams now rely on developers' goodwill and memory to manually update package.json or use tools like ncu manually. A more reliable way would be to automate the process of getting the most reliable dependency versions; though there are no silver-bullet solutions yet, there are two possible automation roads:
(1) CI can fail builds that have obsolete dependencies - using tools like 'npm outdated' or 'npm-check-updates (ncu)'. Doing so will force developers to update dependencies.
(2) Use commercial tools that scan the code and automatically send pull requests with updated dependencies. One interesting question remaining is what the dependency update policy should be - updating on every patch generates too much overhead, while updating right when a major version is released might point to an unstable version (many packages were found vulnerable in the very first days after being released, see the eslint-scope incident).
An efficient update policy may allow some 'vesting period' - let the code lag behind the @latest for some time and versions before considering the local copy obsolete (e.g. local version is 1.3.1 and repository version is 1.3.8)
โ Code Examples
๐ Example: ncu can be used manually or within a CI pipeline to detect to what extent the code lags behind the latest versions
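A hedged sketch of how this might look in practice - the flags come from the npm-check-updates docs, verify them against the version you install:

# List available updates without touching package.json
npx npm-check-updates

# In CI: exit with a non-zero code when any dependency lags behind, failing the build
npx npm-check-updates --errorLevel 2

# Locally: rewrite package.json to the latest versions, then reinstall and run the tests
npx npm-check-updates -u && npm install && npm test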
โช ๏ธ 5.8 Other, non-Node related, CI tips
โ Do: This post is focused on testing advice that is related to, or at least can be exemplified with, Node.js. This bullet, however, groups a few well-known non-Node related tips
- Use a declarative syntax. This is the only option for most vendors, but older versions of Jenkins allow using code or a UI
- Opt for a vendor that has native Docker support
- Fail early, run your fastest tests first. Create a โSmoke testingโ step/milestone that groups multiple fast inspections (e.g. linting, unit tests) and provides snappy feedback to the code committer
- Make it easy to skim-through all build artifacts including test reports, coverage reports, mutation reports, logs, etc
- Create multiple pipelines/jobs for each event, reuse steps between them. For example, configure a job for feature branch commits and a different one for master PR. Let each reuse logic using shared steps (most vendors provide some mechanism for code reuse)
- Never embed secrets in a job declaration, grab them from a secret store or from the jobโs configuration
- Explicitly bump version in a release build or at least ensure the developer did so
- Build only once and perform all the inspections over the single build artifact (e.g. Docker image)
- Test in an ephemeral environment that doesnโt drift state between builds. Caching node_modules might be the only exception
โ Otherwise: Youโll miss years of wisdom
โช ๏ธ 5.9 Build matrix: Run the same CI steps using multiple Node versions
โ Do: Quality checking is about serendipity; the more ground you cover, the luckier you get in detecting issues early. When developing reusable packages or running a multi-customer production with various configurations and Node versions, the CI must run the pipeline of tests over all the permutations of configurations. For example, assuming we use MySQL for some customers and Postgres for others - some CI vendors support a feature called 'Matrix' which allows running the suite of tests against all permutations of MySQL, Postgres and multiple Node versions like 8, 9 and 10. This is done using configuration only, without any additional effort (assuming you have testing or any other quality checks). Other CIs that don't support Matrix might have extensions or tweaks to allow that
โ Code Examples
๐ Example: Using Travis (CI vendor) build definition to run the same test over multiple Node versions
language: node_js
node_js:
- "7"
- "6"
- "5"
- "4"
install:
- npm install
script:
- npm run test
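To also cover the database permutations mentioned above, one hedged option (shown for Travis; other vendors have equivalent matrix features) is adding an env list - Travis then runs the build once for every combination of Node version and env entry:

language: node_js
node_js:
  - "8"
  - "9"
  - "10"
# Each env entry is combined with each Node version above - 3 x 2 = 6 jobs in this sketch
env:
  - DB=mysql
  - DB=postgres
install:
  - npm install
script:
  - npm run test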
Team
Yoni Goldberg
Role: Writer
About: I'm an independent consultant who works with Fortune 500 companies and garage startups on polishing their JS & Node.js applications. More than any other topic, I'm fascinated by and aim to master the art of testing. I'm also the author of Node.js Best Practices
๐ Online Course: Liked this guide and wish to take your testing skills to the extreme? Consider visiting my comprehensive course Testing Node.js & JavaScript From A To Z
Follow:
Bruno Scheufler
Role: Tech reviewer and advisor
Took care to revise, improve, lint and polish all the texts
About: full-stack web engineer, Node.js & GraphQL enthusiast
Ido Richter
Role: Concept, design and great advice
About: A savvy frontend developer, CSS expert and emojis freak
Kyle Martin
Role: Helps keep this project running, and reviews security related practices
About: Loves working on Node.js projects and web application security.
โจ
Contributors
Thanks go to these wonderful people who have contributed to this repository!