diff --git a/middleware/README.md b/middleware/README.md
index 135dee410..8bed5e6e2 100644
--- a/middleware/README.md
+++ b/middleware/README.md
@@ -117,6 +117,115 @@ Package expose following functions to process raw HTTP payloads:
 
 Also it is totally legit to use standard `Buffer` functions like `indexOf` for processing the HTTP payload. Just do not forget that if you modify the body, update the `Content-Length` header with a new value. And if you modify any of the headers, line endings should be `\r\n`. Rest is up to your imagination.
 
+## Masking PII Data
+
+This middleware provides functionality to mask Personally Identifiable Information (PII) data in HTTP requests based on specified headers and JSON paths. It allows you to define a configuration object that specifies which headers and JSON fields should be masked and the type of data they represent.
+
+```javascript
+const gor = require("goreplay_middleware");
+const faker = require("faker");
+
+// Initialize the middleware
+gor.init();
+
+// Configuration for masking PII data
+const maskConfig = {
+  headers: [
+    { name: "Authorization", type: "token" },
+    { name: "X-API-Key", type: "token" },
+    { name: "X-User-Email", type: "email" },
+    { name: "X-User-Name", type: "name" },
+  ],
+  jsonPaths: [
+    { path: "$.user.email", type: "email" },
+    { path: "$.user.name", type: "name" },
+    { path: "$.user.phone", type: "phone" },
+    { path: "$.user.address", type: "address" },
+  ],
+};
+
+// Function to mask a value based on its type
+function maskValue(type) {
+  switch (type) {
+    case "email":
+      return faker.internet.email();
+    case "name":
+      return faker.name.findName();
+    case "phone":
+      return faker.phone.phoneNumber();
+    case "address":
+      return faker.address.streetAddress();
+    case "token":
+      return faker.random.alphaNumeric(32);
+    default:
+      return "***";
+  }
+}
+
+// Middleware function to mask PII data
+gor.on("message", (data) => {
+  // Mask headers
+  maskConfig.headers.forEach((header) => {
+    const value = gor.httpHeader(data.http, header.name);
+    if (value) {
+      data.http = gor.setHttpHeader(data.http, header.name, maskValue(header.type));
+    }
+  });
+
+  // Mask JSON fields
+  const body = gor.httpBody(data.http);
+  if (body) {
+    try {
+      const jsonBody = JSON.parse(body.toString());
+      maskConfig.jsonPaths.forEach((field) => {
+        const value = eval(`jsonBody${field.path.slice(1)}`);
+        if (value) {
+          eval(`jsonBody${field.path.slice(1)} = maskValue(field.type)`);
+        }
+      });
+      data.http = gor.setHttpBody(data.http, Buffer.from(JSON.stringify(jsonBody)));
+    } catch (error) {
+      console.error("Error parsing JSON body:", error);
+    }
+  }
+
+  return data;
+});
+```
+
+### Configuration
+
+The `maskConfig` object is used to configure the masking behavior. It consists of two properties:
+
+- `headers`: An array of objects representing the headers to be masked. Each object should have the following properties:
+  - `name`: The name of the header.
+  - `type`: The type of data the header represents (e.g., "email", "name", "token").
+
+- `jsonPaths`: An array of objects representing the JSON paths to be masked. Each object should have the following properties:
+  - `path`: The JSON path to the field to be masked (e.g., "$.user.email").
+  - `type`: The type of data the field represents (e.g., "email", "name", "phone", "address").
+
+### Masking Function
+
+The `maskValue` function is responsible for generating masked values based on the data type. It uses the Faker library to generate realistic-looking masked data for different types such as email, name, phone, address, and token. You can extend this function to support additional data types or customize the masking behavior.
+
+### Middleware Function
+
+The middleware function is triggered for each HTTP message (request or response) processed by GoReplay. It performs the following steps:
+
+1. Iterate over the specified headers in the `maskConfig` and mask their values using the `maskValue` function based on the associated data type.
+
+2. Parse the JSON body of the request (if present) and iterate over the specified JSON paths in the `maskConfig`. If a value exists at a given path, replace it with a masked value generated by the `maskValue` function based on the associated data type.
+
+3. Update the request body with the masked JSON data.
+
+4. Return the modified HTTP message.
+
+Note: The middleware uses the `eval` function to dynamically access and modify the JSON object based on the provided paths. Exercise caution when using `eval` and ensure that the paths are properly validated to prevent potential security risks.
+
+To use this middleware, make sure to install the required dependencies (`goreplay_middleware` and `faker`), configure the `maskConfig` object according to your needs, and run the middleware with GoReplay.
+
+
 ## Support
 
 Feel free to ask questions here and by sending email to [support@goreplay.org](mailto:support@goreplay.org). Commercial support is available and welcomed 🙈.
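
The added example resolves JSON paths with `eval`, which the section's own note flags as a risk. It also throws when an intermediate object such as `user` is missing, in which case the `catch` branch runs and the body is left unmasked. A minimal eval-free sketch follows, assuming only simple dotted paths of the form `$.a.b` are needed. The helpers `parsePath`, `getPath`, and `setPath` are hypothetical names introduced here for illustration; they are not part of `goreplay_middleware`.

```javascript
// Hypothetical helpers (not part of goreplay_middleware): resolve simple
// dotted paths such as "$.user.email" without eval.

// Own-property check, so the walk never follows the prototype chain.
function hasOwn(obj, key) {
  return obj !== null && typeof obj === "object" &&
    Object.prototype.hasOwnProperty.call(obj, key);
}

// Validate the path and split it into keys; anything that is not a plain
// "$.key.key" path is rejected instead of being executed.
function parsePath(path) {
  if (typeof path !== "string" || !/^\$(\.[A-Za-z_][A-Za-z0-9_]*)+$/.test(path)) {
    throw new Error(`Unsupported JSON path: ${path}`);
  }
  return path.slice(2).split(".");
}

// Return the value at `path`, or undefined if any segment is missing.
function getPath(obj, path) {
  let current = obj;
  for (const key of parsePath(path)) {
    if (!hasOwn(current, key)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Overwrite the value at `path` only if it already exists.
// Returns true when a value was replaced, false otherwise.
function setPath(obj, path, value) {
  const keys = parsePath(path);
  const last = keys.pop();
  let current = obj;
  for (const key of keys) {
    if (!hasOwn(current, key)) {
      return false;
    }
    current = current[key];
  }
  if (!hasOwn(current, last)) {
    return false;
  }
  current[last] = value;
  return true;
}

// Usage with a sample body; a missing path is skipped rather than throwing.
const sample = { user: { email: "jane@example.com" } };
setPath(sample, "$.user.email", "masked@example.com");
setPath(sample, "$.user.phone", "000-0000");
```

With helpers like these, the `forEach` over `maskConfig.jsonPaths` could call `setPath(jsonBody, field.path, maskValue(field.type))` in place of the two `eval` calls. Unlike the original `if (value)` check, this sketch also replaces values that are present but falsy, such as an empty string.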